Compositional Policy Priors

dc.date.accessioned	2013-04-18T00:45:04Z
dc.date.accessioned	2018-11-26T22:26:57Z
dc.date.available	2013-04-18T00:45:04Z
dc.date.available	2018-11-26T22:26:57Z
dc.date.issued	2013-04-12
dc.identifier.uri	http://hdl.handle.net/1721.1/78573
dc.identifier.uri	http://repository.aust.edu.ng/xmlui/handle/1721.1/78573
dc.description.abstract	This paper describes a probabilistic framework for incorporating structured inductive biases into reinforcement learning. These inductive biases arise from policy priors, probability distributions over optimal policies. Borrowing recent ideas from computational linguistics and Bayesian nonparametrics, we define several families of policy priors that express compositional, abstract structure in a domain. Compositionality is expressed using probabilistic context-free grammars, enabling a compact representation of hierarchically organized sub-tasks. Useful sequences of sub-tasks can be cached and reused by extending the grammars nonparametrically using Fragment Grammars. We present Monte Carlo methods for performing inference, and show how structured policy priors lead to substantially faster learning in complex domains compared to methods without inductive biases.	en_US
dc.format.extent	17 p.	en_US
dc.title	Compositional Policy Priors	en_US

Files in this item

Files	Size	Format
MIT-CSAIL-TR-2013-007.pdf	591.7Kb	application/pdf