
The Essential Dynamics Algorithm: Essential Results

dc.date.accessioned: 2004-10-08T20:38:57Z
dc.date.accessioned: 2018-11-24T10:21:40Z
dc.date.available: 2004-10-08T20:38:57Z
dc.date.available: 2018-11-24T10:21:40Z
dc.date.issued: 2003-05-01
dc.identifier.uri: http://hdl.handle.net/1721.1/6718
dc.identifier.uri: http://repository.aust.edu.ng/xmlui/handle/1721.1/6718
dc.description.abstract: This paper presents a novel algorithm for learning in a class of stochastic Markov decision processes (MDPs) with continuous state and action spaces that trades accuracy for speed. A transform of the stochastic MDP into a deterministic one is presented which captures the essence of the original dynamics, in a sense made precise. In this transformed MDP, the calculation of values is greatly simplified. The online algorithm estimates the model of the transformed MDP and simultaneously does policy search against it. Bounds on the error of this approximation are proven, and experimental results in a bicycle-riding domain are presented. The algorithm learns near-optimal policies in orders of magnitude fewer interactions with the stochastic MDP, using less domain knowledge. All code used in the experiments is available on the project's web site.
dc.format.extent: 12 p.
dc.format.extent: 1085830 bytes
dc.format.extent: 303781 bytes
dc.language.iso: en_US
dc.subject: AI
dc.subject: reinforcement learning
dc.subject: bicycle
dc.subject: policy search
dc.subject: Markov decision processes
dc.title: The Essential Dynamics Algorithm: Essential Results
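
The abstract describes a two-part scheme: replace the stochastic dynamics with a deterministic transform that captures their essence, then run policy search against that cheaper model, in which a single rollout evaluates a policy. The paper's actual implementation is the code on the project's web site; the sketch below is only a generic illustration of the idea, assuming the mean next state as the deterministic transform and simple hill climbing as the policy search. Every name, signature, and constant here is chosen for illustration, not taken from the paper.

```python
import numpy as np

def make_deterministic_model(stochastic_step, n_samples=100):
    # Deterministic stand-in for the stochastic dynamics: the empirical
    # mean next state, estimated by sampling the true transition.
    def mean_step(state, action):
        samples = [stochastic_step(state, action) for _ in range(n_samples)]
        return np.mean(samples, axis=0)
    return mean_step

def rollout_return(model, policy, start_state, horizon, reward):
    # Evaluate a policy against the deterministic model: one rollout
    # suffices because the model has no transition noise to average over.
    state, total = start_state, 0.0
    for _ in range(horizon):
        action = policy(state)
        state = model(state, action)
        total += reward(state)
    return total

def policy_search(model, make_policy, start_state, horizon, reward,
                  n_params, iters=200, step=0.1, seed=0):
    # Simple hill climbing on policy parameters; a stand-in for whatever
    # policy search the paper actually uses.
    rng = np.random.default_rng(seed)
    params = rng.normal(size=n_params)
    best = rollout_return(model, make_policy(params), start_state, horizon, reward)
    for _ in range(iters):
        candidate = params + step * rng.normal(size=n_params)
        score = rollout_return(model, make_policy(candidate),
                               start_state, horizon, reward)
        if score > best:
            params, best = candidate, score
    return params, best

# Toy usage: a noisy 1-D integrator with a one-parameter linear policy.
def noisy_integrator(s, a):
    return s + 0.1 * a + np.random.normal(scale=0.05, size=s.shape)

linear_policy = lambda p: (lambda s: float(p @ s))  # action = gain * state
quadratic_cost = lambda s: -float(s @ s)            # reward: drive state to 0

model = make_deterministic_model(noisy_integrator)
params, score = policy_search(model, linear_policy, np.array([1.0]),
                              horizon=50, reward=quadratic_cost, n_params=1)
print("learned gain:", params, "model return:", score)
```

Evaluating rollouts against the averaged model sidesteps the variance of the stochastic transitions, which is the sense in which value calculation is simplified; how closely the deterministic dynamics track the original is exactly what the paper's error bounds address.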


Files in this item

Files              Size      Format
AIM-2003-014.pdf   303.7Kb   application/pdf
AIM-2003-014.ps    1.085Mb   application/postscript
