Applying Q-learning to high-dimensional or continuous action spaces can ...
It has been established that diverse behaviors spanning the controllable...
Learning to control an environment without hand-crafted rewards or exper...
We propose Scheduled Auxiliary Control (SAC-X), a new learning paradigm ...