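Safe Reinforcement Learning Based on Automatic Exploration Process Adjustment with Conservative Control Input under Existence of Disturbance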

加藤佑介; 大川佳寛; 佐々木智丈; 屋並仁史; 滑川徹


J-GLOBAL ID: 202002286187854253  Reference number: 20A2806090

Japanese title: 外乱存在下における探索過程の自動調整と保守的な制御入力に基づく安全強化学習

English title: Safe Reinforcement Learning Based on Automatic Exploration Process Adjustment with Conservative Control Input under Existence of Disturbance


Authors (5): 加藤佑介, 大川佳寛, 佐々木智丈, 屋並仁史, 滑川徹
Source title:
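Volume: 63rd (Web)  Pages: ROMBUNNO.2G1-4 (WEB ONLY)  Publication year: 2020
JST material number: F0989D  Material type: Proceedings (C)
Article type: Original paper  Country of publication: Japan (JPN)  Language: Japanese (JA)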

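・For safe reinforcement learning that can be carried out even in a real environment, the paper presents a sufficient condition, with disturbances taken into account, on the conservative control input (an input containing no exploration component) for guaranteeing safety during learning.
・It proves that automatically adjusting the exploration process of reinforcement learning, together with using the conservative control input without an exploration component, keeps the probability of satisfying the state constraints at or above a given threshold.
・Numerical simulations confirm that the proposed method guarantees safety during learning.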
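
As a rough illustration of the idea summarized above, the following Python sketch caps the exploration noise so that a state constraint holds with at least a given probability, and falls back to an input without exploration when no noise level is admissible. It is not the authors' algorithm. The scalar linear model x_next = a*x + b*u + w, the Gaussian disturbance and exploration noise, the symmetric constraint |x| <= x_max, the fallback input -a*x/b, and every function and parameter name are assumptions made here for illustration only.

```python
import math
import random
from statistics import NormalDist


def two_sided_quantile(eta):
    """z such that a standard normal lies in [-z, z] with probability eta."""
    return NormalDist().inv_cdf(1.0 - (1.0 - eta) / 2.0)


def max_exploration_std(x_nom, x_max, b, sigma_w, eta):
    """Largest exploration std that keeps P(|x_next| <= x_max) >= eta.

    x_nom is the disturbance-free next state under the policy input.
    Returns None when even zero exploration cannot meet the threshold.
    """
    z = two_sided_quantile(eta)
    margin = x_max - abs(x_nom)  # distance from x_nom to the nearer bound
    if margin <= 0.0:
        return None
    # x_next ~ N(x_nom, b^2 * sigma^2 + sigma_w^2); require z * std <= margin.
    allowed_var = (margin / z) ** 2 - sigma_w ** 2
    if allowed_var < 0.0:
        return None
    return math.sqrt(allowed_var) / abs(b)


def conservative_is_safe(x_max, sigma_w, eta):
    """Sufficient condition (in this toy model) for the fallback input:
    it steers the nominal next state to 0, so only the disturbance remains."""
    return x_max >= two_sided_quantile(eta) * sigma_w


def safe_input(x, u_policy, a, b, x_max, sigma_w, eta, sigma_wish, rng):
    """Return (input, mode): exploratory input with capped noise, or the
    conservative input without any exploration term."""
    x_nom = a * x + b * u_policy
    sigma_cap = max_exploration_std(x_nom, x_max, b, sigma_w, eta)
    if sigma_cap is None:
        return -a * x / b, "conservative"  # hypothetical fallback, no noise
    sigma = min(sigma_wish, sigma_cap)  # automatic adjustment of exploration
    return u_policy + rng.gauss(0.0, sigma), "explore"


if __name__ == "__main__":
    rng = random.Random(0)
    print(safe_input(0.0, 0.0, 1.0, 1.0, 1.0, 0.1, 0.95, 0.3, rng))
    print(safe_input(0.95, 0.0, 1.0, 1.0, 1.0, 0.1, 0.95, 0.3, rng))
```

In this sketch the first call is far from the constraint boundary and explores with the requested noise level, while the second is close to the boundary and switches to the conservative input. The two-sided quantile splits the allowed violation probability evenly between the upper and lower bounds, which is a simple but somewhat pessimistic choice.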


Artificial intelligence, Systems and control theory in general

References (8):

R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction, MIT Press, 2nd edition (2018)
M. G. Bellemare, Y. Naddaf, J. Veness and M. Bowling: The Arcade Learning Environment: An Evaluation Platform for General Agents, Journal of Artificial Intelligence Research, 47, pp.253-279 (2013)
J. García and F. Fernández: A Comprehensive Survey on Safe Reinforcement Learning, Journal of Machine Learning Research, 16, pp.1437-1480 (2015)
E. Biyik, J. Margoliash, S. R. Alimo and D. Sadigh: Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models, American Control Conference (ACC), pp.1792-1799 (2019)
Y. Ge, F. Zhu, X. Ling and Q. Liu: Safe Q-Learning Method Based on Constrained Markov Decision Processes, IEEE Access, 7, pp.165007-165017 (2019)
