JAAKKOLA, T. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems. Advances in Neural Information Processing Systems. 1994, 345-352
KIMURA, H. Reinforcement Learning by Stochastic Hill Climbing on Discounted Reward. Proc. 12th Int. Conf. on Machine Learning. 1995, 295-303
LITTMAN, M. L. Learning policies for partially observable environments: Scaling up. Proc. 12th Int. Conf. on Machine Learning. 1995, 362-370
McCALLUM, R. A. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State. Proc. 12th Int. Conf. on Machine Learning. 1995, 387-395