Equivalence Between Wasserstein and Value-Aware Model-based Reinforcement Learning

06/01/2018
by Kavosh Asadi, et al.

Learning a generative model is a key component of model-based reinforcement learning. Though learning a good model in the tabular setting is relatively straightforward, learning a useful model in the approximate setting is challenging. Recently, Farahmand et al. (2017) proposed a value-aware model learning (VAML) objective that captures the structure of the value function during model learning. Using tools from Lipschitz continuity, we show that minimizing the VAML objective is in fact equivalent to minimizing the Wasserstein metric.
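As a rough sketch of the two objectives involved (the notation below is assumed for illustration rather than taken from the abstract): for a fixed state-action pair (s, a), write P(·|s, a) for the true transition distribution and P̂(·|s, a) for the learned model. The VAML loss of Farahmand et al. (2017), restricted to a value-function class $\mathcal{F}$, can be written pointwise as

$$\ell(P, \hat{P})(s, a) \;=\; \sup_{V \in \mathcal{F}} \Big| \int \big( P(s' \mid s, a) - \hat{P}(s' \mid s, a) \big)\, V(s')\, ds' \Big| ,$$

while the Wasserstein metric, via Kantorovich-Rubinstein duality, is

$$W\big( P(\cdot \mid s, a), \hat{P}(\cdot \mid s, a) \big) \;=\; \sup_{f :\, \|f\|_{\mathrm{Lip}} \le 1} \int \big( P(s' \mid s, a) - \hat{P}(s' \mid s, a) \big)\, f(s')\, ds' .$$

If $\mathcal{F}$ is taken to be the class of 1-Lipschitz value functions, the two suprema coincide (the absolute value is immaterial since the class is closed under negation), which is the equivalence the abstract refers to.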
