Instance segmentation in videos, which aims to segment and track multipl...
Video captioning is a challenging task as it needs to accurately transfo...
Video object detection is a challenging task because isolated video f...
Geo-localization is a critical task in computer vision. In this work, we...
In this work, we introduce a Denser Feature Network (DenserNet) for visu...
Vision and voice are two vital keys for agents' interaction and learning...
Describing a video automatically with natural language is a challenging ...