Video question answering (VideoQA) is an essential task in vision-langua...
Video moment retrieval aims at finding the start and end timestamps of a...
The simultaneous recognition of multiple objects in one image remains a
...
We study the problem of weakly supervised grounded image captioning. Tha...
Deep Neural Networks are vulnerable to adversarial examples (Figure 1...
While self-supervised representation learning (SSL) has received widespr...
Text-based image retrieval has seen considerable progress in recent year...
Conventional semi-supervised learning (SSL) methods, e.g., MixMatch, ach...
Text-based person search aims at retrieving a target person in an image ga...
Current training objectives of existing person Re-IDentification (ReID)
...
Pruning has become a very powerful and effective technique to compress a...
One significant factor we expect the video representation learning to
ca...
Although Person Re-Identification has made impressive progress, difficul...
In the conventional person Re-ID setting, it is widely assumed that crop...
Greedy-NMS inherently raises a dilemma, where a lower NMS threshold will...
This paper proposes an adaptive energy management strategy for hybrid
el...
Object detection has achieved remarkable progress in the past decade.
Ho...
With a fixed model structure, knowledge distillation and filter grafting...
Person re-identification (re-ID) is a challenging task due to the high
...
Although great progress in supervised person re-identification (Re-ID) h...
Recently, the research interest in person re-identification (ReID) has
g...
In this paper, we address the problem of monocular depth estimation when...
Although unsupervised person re-identification (RE-ID) has drawn increas...
Most existing Re-IDentification (Re-ID) methods are highly dependent on
...
With the rapid increase of transnational communication and cooperation,
...