Natural TD Learning: A Reinforcement Learning Method Based on the Natural Policy Gradient That Uses the TD Error with Discounted Rewards

Tetsuro Morimura; Eiji Uchibe; Kenji Doya

J-GLOBAL ID: 200902248042684604 Reference number: 05A0410486

Natural TD Learning: Efficient Use of TD-error for Natural Policy Gradient Reinforcement Learning with Discounted Rewards

Authors (3): Tetsuro Morimura, Eiji Uchibe, Kenji Doya
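Source title:
Volume: 104 Issue: 759 (NC2004 169-192) Pages: 137-142 Publication date: March 22, 2005
JST material number: S0532B ISSN: 0913-5685 Document type: Proceedings (C)
Article type: Original paper Country of publication: Japan (JPN) Language: Japanese (JA)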

In recent years, reinforcement learning, a method that learns through interaction with the environment, has been attracting...

Artificial intelligence, Numerical computation

References (17):

UCHIBE, E. Competitive-cooperative-concurrent reinforcement learning with importance sampling. The 8th International Conference on the Simulation of Adaptive Behavior, 2004. 2004, 287-296
BAGNELL, D. Policy search by dynamic programming. Proceedings of Neural Information Processing Systems, 2004. 2004
ROSENSTEIN, M. T. Supervised actor-critic reinforcement learning. 2004, 359-380
AMARI, S. Natural gradient works efficiently in learning. Neural Computation. 1998, 10, 2, 251-276
AMARI, S. Differential and algebraic geometry of multilayer perceptrons. IEICE transactions on fundamentals of electronics, communications and computer sciences. 2001, 84, 1, 31-38
