Creating a taxonomy of interests is expensive and human-effort intensive...
Recent advancements in vision-language pre-training (e.g. CLIP) have sho...
Visual Place Recognition is an essential component of systems for camera...
Text-video retrieval is an important multi-modal learning task, where th...
Knowledge distillation is an effective approach to learn compact models
...
Visual relocalization has been a widely discussed problem in 3D vision: ...
The ultimate aim of image restoration like denoising is to find an exact...
Recent high-performing Human-Object Interaction (HOI) detection techniqu...
Graph Neural Networks (GNNs) aim at integrating node contents with graph...
Cross-view geo-localization (CVGL), which aims to estimate the geographi...
Unsupervised domain adaption (UDA) aims to adapt models learned from a
w...
Recent non-local self-attention methods have proven to be effective in
c...
The non-local network has become a widely used technique for semantic
se...
Detecting out-of-distribution (OOD) data has become a critical component...
Domain generalizable model is attracting increasing attention in medical...
3D semantic scene completion and 2D semantic segmentation are two tightl...
Code completion is widely used by software developers to provide coding
...
Vision Transformers (ViT) have achieved remarkable success in large-scal...
Detecting out-of-distribution (OOD) inputs is a central challenge for sa...
In this paper, we present an efficient spatial-temporal representation f...
Radiotherapy is a treatment where radiation is used to eliminate cancer
...
Deep Neural Network (DNN) based super-resolution algorithms have greatly...
Camera localization aims to estimate 6 DoF camera poses from RGB images....
Augmented reality (AR) has gained increasingly attention from both resea...
Accurate localization is fundamental to a variety of applications, such ...
Two factors have proven to be very important to the performance of seman...
Nasopharyngeal Carcinoma (NPC) is a leading form of Head-and-Neck (HAN)
...
LiDAR point cloud analysis is a core task for 3D computer vision, especi...
The accuracy of deep convolutional neural networks (CNNs) generally impr...
Occlusion is still a severe problem in the video-based Re-IDentification...
Multi-organ segmentation has extensive applications in many clinical
app...
This paper aims to solve machine learning optimization problem by using
...
Detecting objects in 3D LiDAR data is a core technology for autonomous
d...
Pursuing realistic results according to human visual perception is the
c...
Unsupervised learning of depth and ego-motion from unlabelled monocular
...
Neural-networks based image restoration methods tend to use low-resoluti...
Previous work on cross-lingual sequence labeling tasks either requires
p...
In this paper, we propose an end-to-end deep neural network for solving ...
Visual place recognition is an important problem in both computer vision...
Visual tracking is fragile in some difficult scenarios, for instance,
ap...
Modern deep convolutional neural networks (CNNs) for image classificatio...
Designing a robust affinity model is the key issue in multiple target
tr...
We seek to automate the data capturing process in image-based modeling, ...
The aim of this study is to provide an automatic computational framework...
This paper aims at one newly raising task in vision and multimedia resea...