
The Essential Dynamics Algorithm: Essential Results

dc.date.accessioned: 2004-10-08T20:38:57Z
dc.date.accessioned: 2018-11-24T10:21:40Z
dc.date.available: 2004-10-08T20:38:57Z
dc.date.available: 2018-11-24T10:21:40Z
dc.date.issued: 2003-05-01
dc.identifier.uri: http://hdl.handle.net/1721.1/6718
dc.identifier.uri: http://repository.aust.edu.ng/xmlui/handle/1721.1/6718
dc.description.abstract: This paper presents a novel algorithm for learning in a class of stochastic Markov decision processes (MDPs) with continuous state and action spaces that trades accuracy for speed. A transform of the stochastic MDP into a deterministic one is presented which captures the essence of the original dynamics, in a sense made precise. In this transformed MDP, the calculation of values is greatly simplified. The online algorithm estimates the model of the transformed MDP and simultaneously does policy search against it. Bounds on the error of this approximation are proven, and experimental results in a bicycle-riding domain are presented. The algorithm learns near-optimal policies in orders of magnitude fewer interactions with the stochastic MDP, using less domain knowledge. All code used in the experiments is available on the project's web site.
dc.format.extent: 12 p.
dc.format.extent: 1085830 bytes
dc.format.extent: 303781 bytes
dc.language.iso: en_US
dc.subject: AI
dc.subject: reinforcement learning
dc.subject: bicycle
dc.subject: policy search
dc.subject: Markov decision processes
dc.title: The Essential Dynamics Algorithm: Essential Results
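
The abstract describes a two-part scheme: replace the stochastic dynamics with a deterministic transform that captures their essence, then run policy search against that cheaper model, in which a single rollout evaluates a policy. The paper's actual implementation is the code on the project's web site; the sketch below is only a generic illustration of the idea, assuming the mean next state as the deterministic transform and simple hill climbing as the policy search. Every name, signature, and constant here is chosen for illustration, not taken from the paper.

```python
import numpy as np

def make_deterministic_model(stochastic_step, n_samples=100):
    # Deterministic stand-in for the stochastic dynamics: the empirical
    # mean next state, estimated by sampling the true transition.
    def mean_step(state, action):
        samples = [stochastic_step(state, action) for _ in range(n_samples)]
        return np.mean(samples, axis=0)
    return mean_step

def rollout_return(model, policy, start_state, horizon, reward):
    # Evaluate a policy against the deterministic model: one rollout
    # suffices because the model has no transition noise to average over.
    state, total = start_state, 0.0
    for _ in range(horizon):
        action = policy(state)
        state = model(state, action)
        total += reward(state)
    return total

def policy_search(model, make_policy, start_state, horizon, reward,
                  n_params, iters=200, step=0.1, seed=0):
    # Simple hill climbing on policy parameters; a stand-in for whatever
    # policy search the paper actually uses.
    rng = np.random.default_rng(seed)
    params = rng.normal(size=n_params)
    best = rollout_return(model, make_policy(params), start_state, horizon, reward)
    for _ in range(iters):
        candidate = params + step * rng.normal(size=n_params)
        score = rollout_return(model, make_policy(candidate),
                               start_state, horizon, reward)
        if score > best:
            params, best = candidate, score
    return params, best

# Toy usage: a noisy 1-D integrator with a one-parameter linear policy.
def noisy_integrator(s, a):
    return s + 0.1 * a + np.random.normal(scale=0.05, size=s.shape)

linear_policy = lambda p: (lambda s: float(p @ s))  # action = gain * state
quadratic_cost = lambda s: -float(s @ s)            # reward: drive state to 0

model = make_deterministic_model(noisy_integrator)
params, score = policy_search(model, linear_policy, np.array([1.0]),
                              horizon=50, reward=quadratic_cost, n_params=1)
print("learned gain:", params, "model return:", score)
```

Evaluating rollouts against the averaged model sidesteps the variance of the stochastic transitions, which is the sense in which value calculation is simplified; how closely the deterministic dynamics track the original is exactly what the paper's error bounds address.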


Files in this item

Files              Size      Format
AIM-2003-014.pdf   303.7Kb   application/pdf
AIM-2003-014.ps    1.085Mb   application/postscript
