LIN, L.-J. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning. 1992, 8, 293-321
SUTTON, R. S. Integrated architectures for learning, planning, and reacting based on approximating dynamic programming. Proceedings of the Seventh International Conference on Machine Learning. 1990, 216-224
WIERING, M. HQ-learning. Adaptive Behavior. 1997, 6, 2, 219-246