To solve video-and-language grounding tasks, the key is for the network ...
We present a two-step hybrid reinforcement learning (RL) policy that is
...
Current conversational AI systems aim to understand a set of pre-designe...
The predominant approach to visual question answering (VQA) relies on
en...