The AI community has made significant strides in developing powerful
fou...
Magnetic resonance imaging (MRI) has played a crucial role in brain dis...
Not all camouflages are equally effective, as even a partially visible
c...
Although we have witnessed significant progress in human-object interact...
Multi-person motion prediction is a challenging problem due to the depen...
In this study, we aim to initiate the development of Radiology Foundatio...
Video frame interpolation (VFI) is a challenging task that aims to gener...
Without accurate transcription of numerical data in scientific documents...
In this paper, we consider the problem of composed image retrieval (CIR)...
The goal of this paper is open-vocabulary object detection (OVOD)
– ...
Generative models have recently exhibited exceptional capabilities in va...
The objective of Audio-Visual Segmentation (AVS) is to localise the soun...
In this paper, we focus on the problem of Medical Visual Question Answer...
Large Language Models (LLMs) have showcased remarkable capabilities in
n...
Segmentation is a core computer vision competency, with applications spa...
Video Instance Segmentation (VIS) aims at segmenting and categorizing obj...
The objective of this paper is an automatic Audio Description (AD) model...
Camera-only 3D detection provides an economical solution with a simple
c...
In this paper, we consider the problem of temporal action localization u...
Foundation models trained on large-scale datasets have seen a recent surge in ...
Despite the success of multi-modal foundation models pre-trained on
l...
In this paper, we consider the problem of disease diagnosis. Unlike the
...
In this paper, we consider the problem of simultaneously detecting objec...
In this paper, we consider the problem of open-vocabulary semantic
segme...
The goal of this paper is to augment a pre-trained text-to-image diffusi...
In this paper, we consider the problem of enhancing self-supervised
visu...
When trained at a sufficient scale, self-supervised learning has exhibit...
Detecting occluded objects remains a challenge for state-of-the-ar...
The objective of this paper is audio-visual synchronisation of general v...
The objective of this paper is an efficient training method for video ta...
Existing super-resolution models are often specialized for one scale,
fun...
In this paper, we consider the task of unsupervised object discovery in
...
The goal of this work is to segment and name regions of images without a...
Two-dimensional (2D) freehand ultrasound is the mainstay in prenatal car...
In this paper, we consider the problem of generalised visual object coun...
The goal of this paper is to interactively refine the automatic segmenta...
Drones equipped with cameras can significantly enhance human ability to
...
The objective of this paper is a model that is able to discover, track a...
We present a simple yet effective self-supervised framework for audio-vi...
Semantic segmentation has a broad range of applications, but its real-wo...
This paper considers the problem of fast MRI reconstruction. We propose ...
The objective of this paper is a temporal alignment network that ingests...
In this paper, we tackle the challenging task of unsupervised salient ob...
The objective of this paper is few-shot object detection (FSOD) – the ta...
Visual-language pre-training has shown great success for learning joint
...
In this paper, we consider the problem of audio-visual synchronisation
a...
In this paper, we present a framework for reading analog clocks in natur...
The objective of this work is to achieve sensorless reconstruction of a ...
In this paper, we propose a self-supervised approach for tumor segmentat...
The objective of this work is to segment any arbitrary structures of int...