dc.date.accessioned | 2007-11-13T14:45:30Z | |
dc.date.accessioned | 2018-11-24T10:25:47Z | |
dc.date.available | 2007-11-13T14:45:30Z | |
dc.date.available | 2018-11-24T10:25:47Z | |
dc.date.issued | 2007-11-01 | en_US |
dc.identifier.uri | http://hdl.handle.net/1721.1/39427 | |
dc.identifier.uri | http://repository.aust.edu.ng/xmlui/handle/1721.1/39427 | |
dc.description.abstract | Choosing features for the critic in actor-critic algorithms with function approximation is known to be a challenge. Too few critic features can lead to degeneracy of the actor gradient, and too many features may lead to slower convergence of the learner. In this paper, we show that a well-studied class of actor policies satisfies the known requirements for convergence when the actor features are selected carefully. We demonstrate that two popular representations from value-based methods, barycentric interpolators and graph Laplacian proto-value functions, can be used to represent the actor so that these conditions are satisfied. A consequence of this work is a generalization of the proto-value function methods to the continuous-action actor-critic domain. Finally, we analyze the performance of this approach using a simulation of a torque-limited inverted pendulum. | en_US |
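A minimal sketch, not taken from the paper, of how graph Laplacian proto-value functions can be computed and used as features for a linear-Gaussian actor; the chain-graph state space, grid size, and number of features are illustrative assumptions only:

```python
import numpy as np

def proto_value_features(adjacency, num_features):
    """Return the smoothest graph Laplacian eigenvectors as basis features."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # ascending eigenvalues
    # Low-eigenvalue eigenvectors are low-frequency functions on the graph.
    return eigvecs[:, :num_features]

# Hypothetical chain graph over 50 discretized states (e.g., a coarse angle grid).
n_states = 50
adjacency = np.zeros((n_states, n_states))
for i in range(n_states - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

phi = proto_value_features(adjacency, num_features=5)  # shape (50, 5)

# A linear-Gaussian actor: the mean action is linear in the proto-value features.
theta = np.zeros(5)          # actor parameters (illustrative)
state_index = 10
mean_action = phi[state_index] @ theta
```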
dc.format.extent | 9 p. | en_US |
dc.relation | Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory | en_US |
dc.subject | reinforcement learning | en_US |
dc.title | Towards Feature Selection In Actor-Critic Algorithms | en_US |