Pre-training with offline data and online fine-tuning using reinforcemen...
Maximum entropy (MaxEnt) RL maximizes a combination of the original task...
We consider the safe reinforcement learning (RL) problem of maximizing u...
Standard model-free reinforcement learning algorithms optimize a policy ...
We propose temporally abstract soft actor-critic (TASAC), an off-policy ...
In robot sensing scenarios, instead of passively utilizing human capture...
We propose a hierarchical reinforcement learning method, HIDIO, that can...
In this document we describe a rationale for a research program aimed at...
This paper describes an implementation of a bot assistant in Minecraft, ...
In this paper, we propose Efficient Progressive Neural Architecture Sear...
The success of lottery ticket initializations (Frankle and Carbin, 2019)...
The lottery ticket hypothesis proposes that over-parameterization of dee...
Neural Architecture Search (NAS) is a laborious process. Prior work on a...
Recently there has been a rising interest in training agents, embodied i...
Building intelligent agents that can communicate with and learn from hum...
We build a virtual agent for learning language in a 2D maze-like world. ...
One of the long-term goals of artificial intelligence is to build an age...
We tackle a task where an agent learns to navigate in a 2D maze-like env...
We present a unified framework which supports grounding natural-language...
We tackle the problem of video object codetection by leveraging the weak...
Prior work presented the sentence tracker, a method for scoring how well...
We present a method for learning word meanings from complex and realisti...