
Exploration in Gradient-Based Reinforcement Learning

dc.date.accessioned: 2004-10-04T14:37:39Z
dc.date.accessioned: 2018-11-24T10:11:48Z
dc.date.available: 2004-10-04T14:37:39Z
dc.date.available: 2018-11-24T10:11:48Z
dc.date.issued: 2001-04-03
dc.identifier.uri: http://hdl.handle.net/1721.1/6076
dc.identifier.uri: http://repository.aust.edu.ng/xmlui/handle/1721.1/6076
dc.description.abstract: Gradient-based policy search is an alternative to value-function-based methods for reinforcement learning in non-Markovian domains. One apparent drawback of policy search is its requirement that all actions be 'on-policy'; that is, that there be no explicit exploration. In this paper, we provide a method for using importance sampling to allow any well-behaved directed exploration policy during learning. We show both theoretically and experimentally that using this method can achieve dramatic performance improvements.
dc.format.extent: 5594043 bytes
dc.format.extent: 516972 bytes
dc.language.iso: en_US
dc.title: Exploration in Gradient-Based Reinforcement Learning
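
The idea described in the abstract can be illustrated with a small, self-contained sketch. This is a generic reconstruction of importance-weighted policy-gradient learning, not the memo's exact algorithm: the toy bandit environment, the epsilon-smoothed behavior policy, and all hyperparameters below are assumptions made for illustration. Actions are drawn from a separate exploration policy mu, and each REINFORCE update is reweighted by the ratio pi_theta(a) / mu(a), so the gradient estimate remains unbiased for the target policy even though the data are off-policy.

import numpy as np

# Illustrative sketch only: importance sampling lets a gradient-based
# policy-search method (plain REINFORCE with a softmax policy) learn
# from actions chosen by a separate directed-exploration policy.
# The bandit rewards, epsilon, and learning rate are assumptions.

rng = np.random.default_rng(0)
n_actions = 4
true_means = np.array([0.2, 0.5, 0.1, 0.9])  # assumed toy bandit rewards
theta = np.zeros(n_actions)                  # policy parameters


def target_probs(theta):
    """Softmax target policy pi_theta(a)."""
    z = np.exp(theta - theta.max())
    return z / z.sum()


def behavior_probs(theta, epsilon=0.3):
    """Exploration policy mu: an epsilon-smoothed version of pi_theta."""
    pi = target_probs(theta)
    return (1.0 - epsilon) * pi + epsilon / n_actions


alpha = 0.1
for step in range(2000):
    mu = behavior_probs(theta)
    a = rng.choice(n_actions, p=mu)          # act with the exploration policy
    r = true_means[a] + 0.1 * rng.standard_normal()

    pi = target_probs(theta)
    w = pi[a] / mu[a]                        # importance weight pi/mu

    # Importance-weighted REINFORCE update:
    # grad log pi_theta(a) for a softmax policy is one_hot(a) - pi.
    grad_log_pi = -pi
    grad_log_pi[a] += 1.0
    theta += alpha * w * r * grad_log_pi

# Probability mass should concentrate on the highest-reward action.
print("learned policy:", np.round(target_probs(theta), 3))

Because the importance weight corrects for the mismatch between the two distributions, the exploration policy can be chosen freely (any "well-behaved" directed exploration, in the abstract's terms) without biasing the gradient, at the cost of higher variance when pi and mu diverge sharply.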


Files in this item

File               Size      Format
AIM-2001-003.pdf   516.9Kb   application/pdf
AIM-2001-003.ps    5.594Mb   application/postscript
