In recent years, significant progress has been made in video instance
se...
World models, especially in autonomous driving, are trending and drawing...
Semantic segmentation in autonomous driving has been undergoing an evolu...
With the overwhelming trend of mask image modeling led by MAE, generativ...
Equipping embodied agents with commonsense is important for robots to
su...
In this paper, we propose an accurate data-free post-training quantizati...
In this paper, we propose a new detection framework for 3D small object
...
In this paper, we present a dense hybrid proposal modulation (DHPM) meth...
In this paper, we propose an ultrafast automated model compression frame...
Deep learning based fusion methods have been achieving promising perform...
In this paper, we propose binary sparse convolutional networks called BS...
Efficiently digitizing high-fidelity animatable human avatars from video...
3D scene understanding plays a vital role in vision-based autonomous dri...
Semantic occupancy perception is essential for autonomous driving, as
au...
Diffusion models (DMs) have become the new trend of generative models an...
Accurately estimating the shape of objects in dense clutters makes impor...
Modern methods for vision-centric autonomous driving perception widely a...
Diffusion probabilistic models (DPMs) have demonstrated a very promising...
In this paper, we present a new method that reformulates point cloud
com...
Talking head synthesis is a promising approach for the video production
...
Deep learning has revolutionized human society, yet the black-box nature...
With the continuously thriving popularity around the world, fitness acti...
With the rising industrial attention to 3D virtual modeling technology,
...
Text-video retrieval is an important multi-modal learning task, where th...
Object packing by autonomous robots is an im-portant challenge in wareho...
This paper proposes a probabilistic deep metric learning (PDML) framewor...
In this paper, we investigate the dynamics-aware adversarial attack prob...
Data mixing strategies (e.g., CutMix) have shown the ability to greatly
...
The pretrain-finetune paradigm in modern computer vision facilitates the...
3D object detection with surrounding cameras has been a promising direct...
Explaining deep convolutional neural networks has been recently drawing
...
Nowadays, pre-training big models on large-scale datasets has become a
c...
Recent progress in vision Transformers exhibits great success in various...
Grasping in dense clutter is a fundamental skill for autonomous robots.
...
Talking head synthesis is an emerging technology with wide applications ...
Objects are usually associated with multiple attributes, and these attri...
Rapid progress and superior performance have been achieved for skeleton-...
Different people age in different ways. Learning a personalized age esti...
In this paper, we present a new approach for model acceleration by explo...
In this paper, we propose a Shapley value based method to evaluate opera...
This paper presents a language-powered paradigm for ordinal regression.
...
Conventional point cloud semantic segmentation methods usually employ an...
In this paper, we present BEVerse, a unified framework for 3D perception...
This paper proposes an introspective deep metric learning (IDML) framewo...
Gait benchmarks empower the research community to train and evaluate
hig...
Face benchmarks empower the research community to train and evaluate
hig...
A bathtub in a library, a sink in an office, a bed in a laundry room – t...
Most existing action quality assessment methods rely on the deep feature...
Depth estimation from images serves as the fundamental step of 3D percep...
In this paper, we propose the LiDAR Distillation to bridge the domain ga...