Annotating 3D LiDAR point clouds for perception tasks including 3D objec...
We present ImageBind-LLM, a multi-modality instruction tuning method of ...
Referring image segmentation aims to segment the target object referred ...
Large language models (LLMs) have revolutionized natural language proces...
This paper investigates an under-explored but important problem: given a...
Recent advancements in Large Vision-Language Models (LVLMs) have demonst...
Instruction tuning large language model (LLM) on image-text pairs has ac...
Text-guided image generation has witnessed unprecedented progress due to...
Large Vision-Language Models (LVLMs) have recently played a dominant rol...
Token compression aims to speed up large-scale vision transformers (e.g....
Controllable image denoising aims to generate clean samples with human p...
This paper addresses an important problem of ranking the pre-trained dee...
Unsupervised contrastive learning for indoor-scene point clouds has achi...
Vision Transformer (ViT) and its variants (e.g., Swin, PVT) have achieve...
This work presents a probabilistic channel pruning method to accelerate ...
Convolutional Neural Networks (CNNs) are typically constructed by stacki...
Group convolution, which divides the channels of ConvNets into groups, h...
Normalization methods improve both optimization and generalization of Co...
Knowledge Distillation (KD) has been used in image classification for mo...
Batch Normalization (BN) makes the output of hidden neurons have zero mean and...