HEXQ
An open problem in reinforcement learning is the discovery of hierarchical structure. HEXQ, an algorithm which automatically attempts to decompose and solve a model-free factored MDP hierarchically, is described. By searching for aliased Markov sub-space regions based on the state variables, the algorithm uses temporal and state abstraction to construct a hierarchy of interlinked smaller MDPs.
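As a rough illustration of the first step of such a decomposition, the sketch below ranks state variables by how often their values change along an exploration trajectory, which is the order in which HEXQ assigns variables to hierarchy levels. The function name, the tuple representation of factored states, and the random-exploration assumption are illustrative, not part of HEXQ's published pseudocode.

```python
from collections import defaultdict

def order_variables_by_change_frequency(trajectory):
    """Rank the indices of factored state variables by how often their
    values change between consecutive states in a trajectory.

    The fastest-changing variable is listed first; in HEXQ it forms the
    lowest level of the task hierarchy. Variables that never change are
    omitted. `trajectory` is assumed to be a list of equal-length tuples.
    """
    change_counts = defaultdict(int)
    for prev_state, curr_state in zip(trajectory, trajectory[1:]):
        for index, (a, b) in enumerate(zip(prev_state, curr_state)):
            if a != b:
                change_counts[index] += 1
    return sorted(change_counts, key=change_counts.get, reverse=True)

# Example: states of the form (taxi_position, passenger_location).
# The position changes every step, the passenger location rarely, so
# variable 0 is placed at the bottom of the hierarchy.
trajectory = [(0, 3), (1, 3), (2, 3), (2, 0), (3, 0)]
print(order_variables_by_change_frequency(trajectory))  # [0, 1]
```

Region discovery would then proceed per level by partitioning each variable's values into Markov regions and treating unpredictable transitions as exits; that step is omitted here.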
HEXQ is a reinforcement learning algorithm that discovers hierarchical structure automatically. The generated task hierarchy represents the problem at different levels of abstraction. In this paper we extend HEXQ with heuristics that automatically approximate the structure of the task hierarchy. Construction, learning and execution time, as well as the storage requirements of a task hierarchy, may be significantly reduced and traded off against solution quality.