Robust Reinforcement Learning for Finite Transition Probability Set
(Japanese title: 有限遷移確率集合に対するロバスト強化学習)

Article

J-GLOBAL ID: 202202251803411060  Reference number: 22A0619549


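Authors (2):
Source title:
Volume: 2021  Pages: ROMBUNNO.B1-3  Publication date: January 7, 2022
JST material number: L2343B  Document type: Proceedings (C)
Article type: Short report  Country of publication: Japan (JPN)  Language: Japanese (JA)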

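・Defines a stochastic shortest path problem over a finite set of finite MDPs (Markov decision processes).
・Derives a suboptimal policy for the problem so defined.
・Examines a maze problem as a numerical example, showing that the proposed method keeps down the upper bound of the cost over the finite set of finite MDPs.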
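
The record does not describe the authors' algorithm, so the following is only a minimal sketch of the general idea, not the paper's method. It uses a standard worst-case (min-max) value iteration as a stand-in: at each state the agent picks the action whose cost is smallest under the least favourable model in a finite set. The maze layout, the slip probabilities that define the models, and all function names are assumptions made for illustration.

```python
# Toy illustration only: maze, slip probabilities, and names are assumptions.
MAZE = ["S..#",
        ".#..",
        "...G"]          # 'S' start, 'G' goal, '#' wall, '.' free cell
ACTIONS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}
SLIPS = [0.0, 0.2, 0.4]  # one hypothetical MDP per slip probability

CELLS = [(r, c) for r, row in enumerate(MAZE)
         for c, ch in enumerate(row) if ch != "#"]
START = next(s for s in CELLS if MAZE[s[0]][s[1]] == "S")
GOAL = next(s for s in CELLS if MAZE[s[0]][s[1]] == "G")


def step(s, a):
    """Intended successor of state s under action a (blocked moves stay put)."""
    r, c = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
        return (r, c)
    return s


def q_value(V, s, a, slip):
    """Unit step cost; move with probability 1 - slip, stay with probability slip."""
    return 1.0 + (1.0 - slip) * V[step(s, a)] + slip * V[s]


def worst_q(V, s, a):
    """Cost of (s, a) under the least favourable model in the finite set."""
    return max(q_value(V, s, a, slip) for slip in SLIPS)


def robust_value_iteration(tol=1e-10, max_iter=10_000):
    """Min-max value iteration; the goal is absorbing with zero cost."""
    V = {s: 0.0 for s in CELLS}
    for _ in range(max_iter):
        delta = 0.0
        for s in CELLS:
            if s == GOAL:
                continue
            new = min(worst_q(V, s, a) for a in ACTIONS)
            delta = max(delta, abs(new - V[s]))
            V[s] = new
        if delta < tol:
            break
    policy = {s: min(ACTIONS, key=lambda a, s=s: worst_q(V, s, a))
              for s in CELLS if s != GOAL}
    return V, policy


def evaluate(policy, slip, tol=1e-10, max_iter=10_000):
    """Expected cost-to-go of a fixed policy under one fixed model."""
    J = {s: 0.0 for s in CELLS}
    for _ in range(max_iter):
        delta = 0.0
        for s, a in policy.items():
            new = q_value(J, s, a, slip)
            delta = max(delta, abs(new - J[s]))
            J[s] = new
        if delta < tol:
            break
    return J


if __name__ == "__main__":
    V, policy = robust_value_iteration()
    print("robust cost bound at start:", round(V[START], 3))
    for slip in SLIPS:
        print("slip", slip, "cost:", round(evaluate(policy, slip)[START], 3))
```

In this toy setting, evaluating the greedy policy under each fixed model gives a start-state cost no larger than the robust value, which is the sense in which a worst-case formulation bounds the cost over the whole model set.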


Artificial intelligence

References (8):

A. S. Polydoros and L. Nalpantidis: Survey of model-based reinforcement learning: applications on robotics. J. Intell. Robotics Syst., 86-2, 153/173 (2017)
R. Sutton and A. Barto: Reinforcement learning: an introduction, MIT Press (1998)
J. Morimoto and K. Doya: Robust reinforcement learning. Neural Computation, 17-2, 335/359 (2005)
D. P. Bertsekas and S. E. Shreve: Stochastic optimal control: the discrete-time case, Athena Scientific (1996)
M. Duff: Design for an optimal probe, Proc. of the 20th Intl. Conf. on Machine Learning (ICML), 131/138 (2003)
