Public large-scale text-to-image diffusion models, such as Stable Diffus...
In this paper, we present a novel Distribution-Aware Single-stage (DAS) ...
Recent state-of-the-art one-stage instance segmentation model SOLO divid...
Self-attention has become an integral component of the recent network
ar...
3D object detection with LiDAR point clouds plays an important role in
a...
Inspired by the success of BERT, several multimodal representation learn...
Recently, DETR pioneered the solution of vision tasks with transformers,...
Continual learning tackles the setting of learning different tasks
seque...
Transformers, which are popular for language modeling, have been explore...
Knowledge Distillation (KD) refers to transferring knowledge from a larg...
Video-based human pose estimation in crowded scenes is a challenging pro...
Detecting and recognizing human action in videos with crowded scenes is ...
This paper presents our solution to ACM MM challenge: Large-scale
Human-...
Pre-trained language models like BERT and its variants have recently ach...
The inverted residual block is dominating architecture design for mobile...
Previous adversarial domain alignment methods for unsupervised domain
ad...
Salient object detection models often demand a considerable amount of
co...
The searching procedure of neural architecture search (NAS) is notorious...
Neural Architecture Search (NAS) that aims to automate the procedure of
...
We propose a new building block, IdleBlock, which naturally prunes
conne...
In natural images, information is conveyed at different frequencies wher...
Features from multiple scales can greatly benefit the semantic edge dete...
To detect and segment salient objects accurately, existing methods are
u...
Globally modeling and reasoning over relations between regions can be
be...
Learning to capture long-range relations is fundamental to image/video
r...
In this paper, we aim to reduce the computational cost of spatio-tempora...
Aggregating context information from multiple scales has been proved to ...
The ability of predicting the future is important for intelligent system...
An intuition on human segmentation is that when a human is moving in a v...
In this work, we present a simple, highly efficient and modularized Dual...
Learning rich and diverse representations is critical for the performanc...
In this work, we address the challenging video scene parsing problem by
...
In this paper, we consider the scene parsing problem and propose a novel...
Recently, significant improvement has been made on semantic object
segme...