An off-policy reinforcement learning method based on a natural policy gradient method

Yutaka Nakamura; Shin Ishii

J-GLOBAL ID: 200902273164932662  Reference number: 05A0410485

Japanese title: 自然方策勾配法に基づくオフポリシー型強化学習法

English title: An off-policy reinforcement learning method based on a natural policy gradient method


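Authors (2): Yutaka Nakamura, Shin Ishii
Publication title:
Volume: 104  Issue: 759 (NC2004 169-192)  Pages: 131-136  Publication date: March 22, 2005
JST material number: S0532B  ISSN: 0913-5685  Material type: Conference proceedings (C)
Article type: Original paper  Country of publication: Japan (JPN)  Language: Japanese (JA)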

Reinforcement learning methods face a problem known as the "exploration-exploitation problem". This is ...


Artificial intelligence, Numerical computation

References (19):

SUTTON, R. S. Reinforcement Learning : An Introduction. 1998
ABERDEEN, D. A survey of approximate methods for solving partially observable Markov decision processes. 2003
YOSHIMOTO, J. System identification based on on-line variational bayes method and its application to reinforcement learning. Artificial Neural Networks and Neural Information Processing. 2003, 123-131
WILLIAMS, R. J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning. 1992, 8, 229-256
SUTTON, R. S. Policy gradient methods for reinforcement learning with function approximation. Advances in Neural Information Processing Systems. 2000, 12, 1057-1063
