One of the mainstream schemes for 2D human pose estimation (HPE) is lear...
Large-scale image-text contrastive pre-training models, such as CLIP, ha...
Existing methods of multi-person video 3D human Pose and Shape Estimatio...
Modeling dynamics in the form of partial differential equations (PDEs) i...
Modern deep learning-based 3D pose estimation approaches require plenty ...
Video 3D human pose estimation aims to localize the 3D coordinates of hu...
Masked image modeling (MIM) has achieved promising results on various vi...
Multi-person 3D pose estimation is a challenging task because of occlusi...
We study joint learning of Convolutional Neural Network (CNN) and Transf...
We propose Pixel-BERT to align image pixels with text by deep multi-moda...
Deep generative models are tremendously successful in learning
low-dimen...