Whole-body pose estimation localizes the human body, hand, face, and foo...
This paper introduces DreamDiffusion, a novel method for generating
high...
We address the problem of photorealistic 3D face avatar synthesis from s...
Data-free meta-learning (DFML) aims to enable efficient learning of new ...
Existing wisdom demonstrates the significance of syntactic knowledge for...
Vision-language models have achieved tremendous progress far beyond what...
It has been commonly observed that a teacher model with superior perform...
Real-world data usually suffers from severe class imbalance and long-tai...
3D GAN inversion aims to achieve high reconstruction fidelity and reason...
The goal of data-free meta-learning is to learn useful prior knowledge f...
We present a simple yet effective method for skeleton-free motion
retarg...
Most existing image restoration methods use neural networks to learn str...
In the real world, data tends to follow long-tailed distributions w.r.t....
Real-world data contains a vast amount of multimodal information, among ...
Text-based style transfer is a newly-emerging research topic that uses t...
The recent GAN inversion methods have been able to successfully invert t...
Learning with noisy label (LNL) is a classic problem that has been
exten...
The real-world data tends to be heavily imbalanced and severely skew the...
High-fidelity facial avatar reconstruction from a monocular video is a
s...
Despite the remarkable success of foundation models, their task-specific...
The traditional model upgrading paradigm for retrieval requires recomput...
Instance-dependent label noise is realistic but rather challenging, wher...
Rendering high-resolution (HR) graphics brings substantial computational...
This paper proposes a novel video inpainting method. We make three main
...
Offline Reinforcement Learning has attracted much interest in solving th...
Matching-based methods, especially those based on space-time memory, are...
Image retrieval has become an increasingly appealing technique with broa...
Existing neural style transfer researches have studied to match statisti...
The task of privacy-preserving model upgrades in image retrieval desires...
The evaluation of 3D face reconstruction results typically relies on a r...
Conventional model upgrades for visual search systems require offline re...
The task of hot-refresh model upgrades of image retrieval systems plays ...
Deep hashing has shown promising performance in large-scale image retrie...
Exemplar-based colorization approaches rely on reference image to provid...
Scene text detection is still a challenging task, as there may be extrem...
Real-world data universally confronts a severe class-imbalance problem a...
Spatiotemporal predictive learning (ST-PL) aims at predicting the subseq...
Vessel tracing by modeling vascular structures in 3D medical images with...
Extracting variation and spatiotemporal features via limited frames rema...
Deep neural networks are often not robust to semantically-irrelevant cha...
For further progress in video object segmentation (VOS), larger, more
di...
The task of language-guided video temporal grounding is to localize the
...
Scene graph generation aims to produce structured representations for im...
Minute pixel changes in an image drastically change the prediction that ...
Image steganography refers to the process of hiding information inside
i...
Existing person video generation methods either lack the flexibility in
...
Many types of 3D acquisition sensors have emerged in recent years and po...
Image classification is a fundamental application in computer vision.
Re...
Correlation filter (CF) based tracking algorithms have demonstrated favo...
Visual tracking is one of the most important application areas of comput...