Recently, large-scale pre-trained language-image models like CLIP have s...
Visual tracking has made significant improvements in the past few decade...
Spatial convolutions are extensively used in numerous deep video models....
Parameter-efficient transfer learning (PETL) based on large-scale pre-tr...
Learning from changing tasks and sequential experience without forgettin...
Visual object tracking is an essential capability of intelligent robots....
The task of Human-Object Interaction (HOI) detection targets fine-graine...
Standard approaches for video recognition usually operate on the full in...
Temporal contexts among consecutive frames are far from being fully util...
Spatial convolutions are widely used in numerous deep video models. It f...
Current approaches for video grounding propose kinds of complex architec...
The central idea of contrastive learning is to discriminate between diff...
Temporal action localization aims to localize starting and ending time w...
Weakly-Supervised Temporal Action Localization (WS-TAL) task aims to rec...
This technical report presents our solution for temporal action detectio...
This paper presents our solution to the AVA-Kinetics Crossover Challenge...
This technical report analyzes an egocentric video action detection meth...
With the recent surge in the research of vision transformers, they have ...
Exploiting multi-scale features has shown great potential in tackling se...
In this paper, we provide (i) a rigorous general theory to elicit condit...
Video-based human pose estimation in crowded scenes is a challenging pro...
Detecting and recognizing human action in videos with crowded scenes is ...
This paper presents our solution to ACM MM challenge: Large-scale Human-...
In recent years, self-supervised methods for monocular depth estimation ...
Most existing trackers based on discriminative correlation filters (DCF)...
Correlation filter-based tracking has been widely applied in unmanned ae...
The outstanding computational efficiency of discriminative correlation f...
Due to implicitly introduced periodic shifting of limited searching area...
Traditional framework of discriminative correlation filters (DCF) is oft...