We propose A-Crab (Actor-Critic Regularized by Average Bellman error), a...
Offline reinforcement learning (RL), which refers to decision-making fro...
In online reinforcement learning (RL), efficient exploration remains par...
Offline (or batch) reinforcement learning (RL) algorithms seek to learn ...
We present an efficient and practical (polynomial time) algorithm for on...