A Study on Model-based Deep Reinforcement Learning with Intrinsic Subgoal Reward (サブゴールによる内発的報酬を用いたモデルベース深層強化学習の考察)

丸山元輝; 遠藤聡志; 山田孝治

J-GLOBAL ID: 202002283965993954  Reference number: 20A2181707

Authors (3): 丸山元輝, 遠藤聡志, 山田孝治
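Source title:
Volume: 19th  Issue: Part 2  Pages: 47-50  Publication date: August 18, 2020
JST material number: L4664A  Material type: Proceedings (C)
Article type: Original paper  Country of publication: Japan (JPN)  Language: Japanese (JA)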

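・The goal is to develop an algorithm for model-based deep reinforcement learning in which the agent autonomously estimates subgoals and learns from them.
・Two methods are proposed: one generates a reward from the similarity between a lookahead produced by a deep generative model and the subgoal, and the other gives an intrinsic reward for the subgoal.
・The study confirmed that setting subgoals improves the convergence speed of learning, and that learning also converges when the agent sets subgoals autonomously.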
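
The record does not give the paper's formulas, so the following is only a minimal sketch of the general idea of a similarity-based subgoal reward. It assumes states and subgoals are feature vectors, that similarity is cosine similarity, and that the intrinsic term is added to the environment reward with a fixed weight. The names `cosine_similarity`, `shaped_reward`, `toy_model`, and `scale` are hypothetical, and `toy_model` is a hand-written stand-in for a learned deep generative model.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def shaped_reward(
    extrinsic_reward: float,
    state: Vector,
    action: float,
    subgoal: Vector,
    predict_next: Callable[[Vector, float], Vector],
    scale: float = 0.1,  # assumed weight of the intrinsic term
) -> float:
    """Environment reward plus an intrinsic bonus for predicted states near the subgoal."""
    predicted = predict_next(state, action)  # one-step lookahead
    intrinsic = scale * cosine_similarity(predicted, subgoal)
    return extrinsic_reward + intrinsic


def toy_model(state: Vector, action: float) -> Vector:
    """Hypothetical stand-in for a learned model: the action shifts the first feature."""
    return [state[0] + action, state[1]]


# Example call: the lookahead from [0, 1] with action 1.0 lands on the subgoal [1, 1].
reward = shaped_reward(0.0, [0.0, 1.0], 1.0, [1.0, 1.0], toy_model)
```

In this sketch, a lookahead that points in the same direction as the subgoal earns the full bonus of `scale`, while unrelated predictions earn less, which is one simple way a subgoal could steer learning before any extrinsic reward arrives.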

Artificial intelligence

References (9):

Steven Kapturowski, Georg Ostrovski, John Quan, Rémi Munos, and Will Dabney. Recurrent experience replay in distributed reinforcement learning. In 7th International Conference on Learning Representations, ICLR 2019, 2019.
Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado Van Hasselt, and David Silver. Distributed prioritized experience replay. In 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, 2018.
Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy P. Lillicrap, and David Silver. Mastering Atari, Go, chess and shogi by planning with a learned model. arXiv, abs/1911.08265, 2019.
Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, and Trevor Darrell. Curiosity-driven exploration by self-supervised prediction. In 34th International Conference on Machine Learning, ICML 2017, 2017.
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 2672-2680. Curran Associates, Inc., 2014.
