To handle graphs in which features or connectivities are evolving over t...
3D scene understanding has gained significant attention due to its wide ...
In the field of autonomous driving, there have been many excellent perce...
3D visual grounding aims to localize the target object in a 3D point clo...
3D visual grounding involves finding a target object in a 3D scene that ...
An authentic face restoration system is becoming increasingly demanding ...
LLMs have demonstrated remarkable abilities at interacting with humans t...
Environmental sustainability, driven by concerns about climate change, r...
Due to limited resources on edge and different characteristics of deep n...
In this paper, we propose a novel language-guided 3D arbitrary neural st...
Scaling bottlenecks the making of digital quantum computers, posing chal...
Multi-modal Contrastive Representation (MCR) learning aims to encode dif...
The techniques for 3D indoor scene capturing are widely used, but the me...
We introduce a novel dataset consisting of images depicting pink eggs th...
Instant on-device Neural Radiance Fields (NeRFs) are in growing demand f...
Text image machine translation (TIMT) has been widely used in various re...
Text image machine translation (TIMT) aims to translate texts embedded i...
As deep neural networks (DNNs) are being applied to a wide range of edge...
Many applications can benefit from personalized image generation models,...
Product Retrieval (PR) and Grounding (PG), aiming to seek image and obje...
This paper proposes a method for generating images of customized objects...
Current methods for few-shot segmentation (FSSeg) have mainly focused on...
Numerous research efforts have been made to stabilize the training of th...
Retaining walls are often built to prevent excessive lateral movements o...
Backscatter Communication (BackCom) nodes harvest energy from and modula...
To balance the annotation labor and the granularity of supervision, sing...
A common scenario of Multilingual Neural Machine Translation (MNMT) is t...
Neural Radiance Field (NeRF) based rendering has attracted growing atten...
Stereo images, containing left and right view images with disparity, are...
Current 3D object detection methods heavily rely on an enormous amount o...
Multiplication is arguably the most cost-dominant operation in modern de...
Vision Transformers (ViTs) have achieved state-of-the-art performance on...
In the e-commerce industry, graph neural network methods are the new trends ...
Leveraging supervised information can lead to superior retrieval perform...
End-to-end text image translation (TIT), which aims at translating the s...
In this letter, we study the simultaneously transmitting and reflecting ...
In this paper, we introduce a new task, spoken video grounding (SVG), wh...
With the wide application of sparse ToF sensors in mobile devices, RGB i...
Unsupervised domain adaptation (UDA) is an approach to minimizing domain...
The task of argument mining aims to detect all possible argumentative co...
This work presents the first silicon-validated dedicated EGM-to-ECG (G2C...
Temporal action segmentation in videos has drawn much attention recently...
We present a first-of-its-kind ultra-compact intelligent camera system, ...
Recently, streaming technology has greatly promoted the developmen...
Eye tracking has become an essential human-machine interaction modality ...
Detecting fraudulent transactions is an essential component to control r...
Simultaneous Localization and Mapping (SLAM) plays an important role in ...
At online retail platforms, detecting fraudulent accounts and transactio...
The performance of a semantic segmentation model for remote sensing (RS)...
We study the problem of constructing the control driving a controlled di...