Understanding which concepts models can and cannot represent has been
fu...
The field of vision and language has witnessed a proliferation of pre-tr...
Questions regarding implicitness, ambiguity and underspecification are
c...
We present the Pathways Autoregressive Text-to-Image (Parti) model, whic...
We study the problem of synthesizing immersive 3D indoor scenes from one...
We study the automatic generation of navigation instructions from 360-de...
Pretraining language models with next-token prediction on massive text
c...
Both image-caption pairs and translation pairs provide the means to lear...
People navigating in unfamiliar buildings take advantage of myriad visua...
Speech-based image retrieval has been studied as a proxy for joint
repre...
PanGEA, the Panoramic Graph Environment Annotation toolkit, is a lightwe...
Vision-and-Language Navigation wayfinding agents can be enhanced by
expl...
The output of text-to-image synthesis systems should be coherent, clear,...
Localized Narratives is a dataset with detailed natural language descrip...
We introduce Room-Across-Room (RxR), a new Vision-and-Language Navigatio...
We present a multi-level geocoding model (MLG) that learns to associate ...
Training data for text classification is often limited in practice,
espe...
Image captioning datasets have proven useful for multimodal representati...
The Touchdown dataset (Chen et al., 2019) provides instructions by human...
Language is central to human intelligence. We review recent breakthrough...
We show that it is feasible to perform entity linking by training a dual...
Systems that can associate images with their spoken audio captions are a...
Most existing work on adversarial data generation focuses on English. Fo...
Vision-and-Language Navigation (VLN) tasks such as Room-to-Room (R2R) re...
In instruction conditioned navigation, agents interpret natural language...
Vision-and-Language Navigation (VLN) is a natural language grounding tas...
Advances in learning and representations have reinvigorated work that
co...
Existing paraphrase identification datasets lack sentence pairs that hav...
Paraphrasing is rooted in semantics. We show the effectiveness of
transf...
Coreference resolution is an important task for natural language
underst...
We address fine-grained multilingual language identification: providing ...
Split and rephrase is the task of breaking down a sentence into shorter ...
Unsupervised models of dependency parsing typically require large amount...
This paper tackles temporal resolution of documents, such as determining...