Despite advances in Visual Question Answering (VQA), the ability of mode...
Visual and linguistic concepts naturally organize themselves in a hierar...
Humans can learn and reason under substantial uncertainty in a space of
...
We address the question of characterizing and finding optimal representa...
We present a hierarchical reinforcement learning (HRL) or options framew...
We propose a new class of probabilistic neural-symbolic models, that hav...
It is easy for people to imagine what a man with pink hair looks like, e...
To be able to interact better with humans, it is crucial for machines to...
We introduce an inference technique to produce discriminative context-aw...
We propose a technique for making Convolutional Neural Network (CNN)-bas...
We propose a technique for producing "visual explanations" for decisions...
We are interested in counting the number of instances of object classes ...
We propose a model to learn visually grounded word embeddings (vis-w2v) ...
In this paper we describe the Microsoft COCO Caption dataset and evaluat...
Automatically describing an image with a sentence is a long-standing
cha...
We describe our two new datasets with images described by humans. Both t...