We propose and demonstrate a framework called perception as prediction f...
Learning good representations is a long standing problem in reinforcemen...
We propose a method to tackle the problem of mapless collision-avoidance...
We explore fixed-horizon temporal difference (TD) methods, reinforcement...
Importance sampling (IS) is a common reweighting strategy for off-policy...