When humans read or listen, they make implicit commonsense inferences th...
The popularity of image sharing on social media and the engagement it cr...
We introduce the first dataset for sequential vision-to-language, and ex...
Representation and learning of commonsense knowledge is one of the
found...
There has been an explosion of work in the vision & language community d...
Integrating vision and language has long been a dream in work on artific...