JAAKKOLA, T. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems. Advances in Neural Information Processing Systems. 1994, 7, 345-352
MCCALLUM, R. A. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State. Proceedings of the 12th International Conference on Machine Learning. 1995, 387-395