Scale Invariant Solutions for Overdetermined Linear Systems with Applications to Reinforcement Learning

04/15/2021
by Rahul Madhavan, et al.

Overdetermined linear systems are common in reinforcement learning, e.g., in Q-function and value-function estimation with function approximation. The standard least-squares criterion, however, leads to a solution that is unduly influenced by rows with large norms. This is a serious issue, especially when the matrices in these systems are beyond user control. To address this, we propose a scale-invariant criterion that we then use to develop two novel algorithms for value function estimation: Normalized Monte Carlo and Normalized TD(0). Separately, we also introduce a novel adaptive stepsize that may be useful beyond this work. We use simulations and theoretical guarantees to demonstrate the efficacy of our ideas.
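
The sensitivity of least squares to row scaling can be seen on a tiny example. The sketch below is an illustration only, not the paper's algorithms: it assumes that "scale-invariant" can be modeled by dividing each equation by the Euclidean norm of its row before solving, which is one natural choice and is not confirmed by the abstract. The system, the helper names (least_squares_2d, normalize_rows, scale_row), and the scaling factor are all hypothetical.

```python
import math

def least_squares_2d(rows, rhs):
    """Minimize sum_i (a_i . x - b_i)^2 over x in R^2 via the normal equations."""
    s11 = sum(a[0] * a[0] for a in rows)
    s12 = sum(a[0] * a[1] for a in rows)
    s22 = sum(a[1] * a[1] for a in rows)
    t1 = sum(a[0] * b for a, b in zip(rows, rhs))
    t2 = sum(a[1] * b for a, b in zip(rows, rhs))
    det = s11 * s22 - s12 * s12  # nonzero when the rows span R^2
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

def normalize_rows(rows, rhs):
    """Divide each equation by the norm of its row (assumed criterion)."""
    new_rows, new_rhs = [], []
    for a, b in zip(rows, rhs):
        norm = math.hypot(a[0], a[1])
        new_rows.append((a[0] / norm, a[1] / norm))
        new_rhs.append(b / norm)
    return new_rows, new_rhs

def scale_row(rows, rhs, index, factor):
    """Multiply one equation by a constant; its solution set is unchanged."""
    new_rows, new_rhs = list(rows), list(rhs)
    new_rows[index] = (rows[index][0] * factor, rows[index][1] * factor)
    new_rhs[index] = rhs[index] * factor
    return new_rows, new_rhs

# Hypothetical inconsistent system: x = 1, y = 1, x + y = 3.
rows = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
rhs = [1.0, 1.0, 3.0]

# The same system with the third equation multiplied by 100.
rows_scaled, rhs_scaled = scale_row(rows, rhs, index=2, factor=100.0)

# Plain least squares: the answer moves when one row is rescaled.
plain = least_squares_2d(rows, rhs)
plain_scaled = least_squares_2d(rows_scaled, rhs_scaled)

# Row-normalized least squares: the answer stays put.
normalized = least_squares_2d(*normalize_rows(rows, rhs))
normalized_scaled = least_squares_2d(*normalize_rows(rows_scaled, rhs_scaled))

print("plain:", plain, "after rescaling:", plain_scaled)
print("normalized:", normalized, "after rescaling:", normalized_scaled)
```

Multiplying an equation by a constant does not change which points satisfy it, yet the plain least-squares solution is pulled toward the rescaled row, while the row-normalized solution is the same for both versions of the system up to floating-point rounding.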
