World Programs for Model-Based Learning and Planning in Compositional State and Action Spaces

12/30/2019
by Marwin H. S. Segler, et al.

Some of the most important tasks take place in environments which lack cheap and perfect simulators, thus hampering the application of model-free reinforcement learning (RL). While model-based RL aims to learn a dynamics model, in the more general case the learner does not even know a priori what the action space is. Here we propose a formalism in which the learner induces a world program: it learns both the dynamics model and the actions of a graph-based compositional environment from observed state-state transition examples. The learner can then perform RL with the world program as the simulator for complex planning tasks. We highlight a recent application, and propose a challenge for the community to assess world program-based planning.
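The core idea can be illustrated with a toy sketch. The snippet below is not the paper's implementation; it assumes graph states represented as frozensets of undirected edges, induces each action as a graph edit (edges removed, edges added) from a single observed transition, and then reuses the induced action set as a forward simulator for simple breadth-first planning. All function names (induce_action, apply_action, induce_world_program, plan) and the example transitions are hypothetical.

```python
# Minimal sketch (assumptions, not the authors' method): states are small
# undirected graphs encoded as frozensets of edges; each edge is a frozenset
# of two node ids. Actions are induced from observed (state, next_state)
# pairs as graph edits, and the resulting action set acts as a simulator.
from collections import deque


def induce_action(state, next_state):
    """Infer a graph-edit action (edges removed, edges added) from one transition."""
    removed = state - next_state
    added = next_state - state
    return (frozenset(removed), frozenset(added))


def apply_action(state, action):
    """Use an induced action as a forward model; None if preconditions fail."""
    removed, added = action
    if not removed <= state:  # all edges to be removed must be present
        return None
    return (state - removed) | added


def induce_world_program(transitions):
    """Collect the distinct actions observed in the transition data."""
    return {induce_action(s, s_next) for s, s_next in transitions}


def plan(start, goal, actions, max_depth=6):
    """Breadth-first search using the induced world program as the simulator."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        if len(path) >= max_depth:
            continue
        for action in actions:
            nxt = apply_action(state, action)
            if nxt is not None and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [action]))
    return None


if __name__ == "__main__":
    # Hypothetical observed transitions: edge rearrangement and edge addition.
    e = lambda a, b: frozenset({a, b})
    transitions = [
        (frozenset({e(1, 2), e(2, 3)}), frozenset({e(1, 2), e(1, 3)})),
        (frozenset({e(1, 2)}), frozenset({e(1, 2), e(3, 4)})),
    ]
    actions = induce_world_program(transitions)
    start = frozenset({e(1, 2), e(2, 3)})
    goal = frozenset({e(1, 2), e(1, 3), e(3, 4)})
    print(plan(start, goal, actions))  # a two-step plan reaching the goal graph
```

In this sketch the "world program" is just the set of induced graph edits plus the apply_action rule; the paper's formalism is more general, but the example shows how learned actions can stand in for a hand-written simulator during planning.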

