Hierarchical Reinforcement Learning Based on Return-Weighted Density Estimation

長隆之; 杉山将

Literature

J-GLOBAL ID: 201702227733445097 Reference number: 17A1913667


Hierarchical Reinforcement Learning Based on Return-Weighted Density Estimation


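Volume: 117 Issue: 293 (IBISML2017 35-89) Pages: 243-249 Publication date: November 2, 2017
JST material number: S0532B ISSN: 0913-5685 Document type: Proceedings (C)
Article type: Original paper Country of publication: Japan (JPN) Language: Japanese (JA)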

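We propose a hierarchical reinforcement learning method for learning optimal policies when the reward function has multiple modes. Hierarchical reinforcement learning requires learning several lower-level policies together with an upper-level policy that selects the appropriate lower-level policy for the given conditions. The proposed method automatically determines the number and placement of the lower-level policies through return-weighted density estimation. We apply the proposed method to tasks such as trajectory planning and demonstrate its performance. (Author abstract)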
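The idea of letting a return-weighted density decide how many lower-level policies are needed, and where they sit, can be illustrated with a small toy sketch. This is not the authors' algorithm: the abstract does not specify the density estimator, so the sketch swaps in a weighted Gaussian kernel density estimate with mean-shift mode seeking. The one-dimensional return function `returns`, the `temperature` and `bandwidth` values, and all function names are hypothetical and chosen only for illustration.

```python
import math

def returns(x):
    # Hypothetical bimodal return function with peaks near x = -2 and x = 2.
    return math.exp(-(x + 2.0) ** 2) + math.exp(-(x - 2.0) ** 2)

def return_weights(rs, temperature=0.2):
    # Exponentiated returns, shifted by the maximum for numerical stability
    # and normalized to sum to one (temperature is an assumed constant).
    r_max = max(rs)
    ws = [math.exp((r - r_max) / temperature) for r in rs]
    total = sum(ws)
    return [w / total for w in ws]

def mean_shift(x, samples, weights, bandwidth=0.5, iters=100, tol=1e-9):
    # Climb the return-weighted Gaussian KDE from x to a nearby local mode.
    for _ in range(iters):
        ks = [w * math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
              for s, w in zip(samples, weights)]
        x_new = sum(k * s for k, s in zip(ks, samples)) / sum(ks)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def find_modes(samples, weights, bandwidth=0.5, merge_tol=0.1):
    # Start only from samples whose weight exceeds the uniform level 1/N,
    # then merge end points that land within merge_tol of each other.
    threshold = 1.0 / len(samples)
    modes = []
    for s, w in zip(samples, weights):
        if w <= threshold:
            continue
        m = mean_shift(s, samples, weights, bandwidth)
        if all(abs(m - q) > merge_tol for q in modes):
            modes.append(m)
    return sorted(modes)

# Deterministic grid of candidate actions standing in for sampled rollouts.
samples = [-4.0 + 0.1 * i for i in range(81)]
weights = return_weights([returns(x) for x in samples])
modes = find_modes(samples, weights)

# Each mode would seed one lower-level policy in this toy setting.
print(len(modes), [round(m, 2) for m in modes])
```

With the toy return above, the sketch is expected to find one mode near each peak, so the number of lower-level policies follows from the data rather than being fixed in advance. Learning an upper-level policy that selects among these modes for a given context is outside the scope of this sketch.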


Artificial intelligence

