With ever increasing parameters and computation, vision-language pre-tra...
In this paper, we present a simple, flexible and effective vision-langua...
Given the long textual product information and the product image, Multi-...
Issue-commit links, as a type of software traceability links, play a vit...
The task of Human-Object Interaction (HOI) detection is to detect humans...
Face forgery techniques have advanced rapidly and pose serious security...
In recent years, 3D representation learning has turned to 2D vision-lang...
Human-Object Interaction (HOI) detection aims to understand the interact...
Deepfakes are realistic face manipulations that can pose serious threats...
Deep neural networks often suffer from poor generalization due to comple...
Prompt tuning is a parameter-efficient way to deploy large-scale pre-tra...
Multimodal Large Language Model (MLLM) relies on the powerful LLM to per...
This paper presents a Spatial Re-parameterization (SpRe) method for the ...
Recently, deep learning-based facial landmark detection has achieved sig...
Pre-trained language models (PLMs) have played an increasing role in mul...
Token compression aims to speed up large-scale vision transformers (e.g....
Camouflaged Object Detection (COD) is a challenging task in computer vis...
Refactoring is an indispensable practice of improving the quality and ma...
Recently, growing interest has been aroused in extending the multimodal...
Arbitrary bit-width network quantization has received significant attent...
This paper introduces Distribution-Flexible Subset Quantization (DFSQ), ...
Interactive image segmentation enables annotators to efficiently perform...
Deep neural networks have been applied in many computer vision tasks and...
Text-driven 3D stylization is a complex and crucial task in the fields o...
Adversarial training can improve the robustness of neural networks. Prev...
In this paper, we propose YOSO, a real-time panoptic segmentation framew...
Occluded person re-identification (Re-ID) aims to address the potential...
In this paper, we study teacher-student learning from the perspective of...
Semi-supervised object detection (SSOD) is a research hot spot in comput...
Parameter-efficient transfer learning (PETL) is an emerging research spo...
In this paper, we study the local visual modeling with grid features for...
We focus on addressing the dense backward propagation issue for training...
Visible-infrared person re-identification (VI-ReID) aims to match specif...
Cross-spectral person re-identification, which aims to associate identit...
Unsupervised domain adaptation person re-identification (Re-ID) aims to...
Despite excellent performance in image generation, Generative Adversaria...
CutMix is a vital augmentation strategy that determines the performance ...
This paper proposes a content relationship distillation (CRD) to tackle ...
Most shadow removal methods rely on the invasion of training images asso...
Nowadays, Multi-purpose Messaging Mobile App (MMMA) has become increasin...
Recent advances in 3D point cloud analysis bring a diverse set of networ...
Quantization-aware training (QAT) receives extensive popularity as it we...
Deep neural networks often suffer from poor generalization caused by com...
Modeling sparse and dense image matching within a unified functional cor...
Masked autoencoders have become popular training paradigms for self-supe...
This paper focuses on the limitations of current over-parameterized shad...
Visible-infrared person re-identification (VI-ReID) is a task of matchin...
GAN inversion aims to invert an input image into the latent space of a p...
Although person re-identification has achieved an impressive improvement...
Building a universal video-language model for solving various video unde...