Adversarial robustness poses a critical challenge in the deployment of d...
Causal Video Question Answering (CVidQA) queries not only association or...
The large amount of data collected by LiDAR sensors brings the issue of ...
While recent large-scale video-language pre-training made great progress...
Monocular 3D object detection is an important yet challenging task in
au...
In few-shot imitation learning (FSIL), using behavioral cloning (BC) to ...
Anomaly awareness is an essential capability for safety-critical applica...
Spatial-temporal prediction is a critical problem for intelligent
transp...
Understanding and comprehending video content is crucial for many real-w...
Despite the success of deep learning on supervised point cloud semantic
...
Most of the 3D networks are trained from scratch owing to the lack of
l...
To effectively apply robots in working environments and assist humans, i...
Dense depth estimation plays a key role in multiple applications such as...
While recent progress has significantly boosted few-shot classification ...
The human ability of deep cognitive skills is crucial for the developme...
We study a novel task, Video Question-Answer Generation (VQAG), for
chal...
We propose an end-to-end grasp detection network, Grasp Detection Netwo...
Video Question Answering (Video QA) is a critical and challenging task i...