Human-Object Interaction (HOI) detection aims to localize human-object p...
Human object interaction (HOI) detection plays a crucial role in human-c...
Breakthroughs in transformer-based models have revolutionized not only t...
We aim to tackle the problem of point-based interactive segmentation, in...
Self-supervised vision-and-language pretraining (VLP) aims to learn tran...
In this paper, we present GEM as a General Evaluation benchmark for Mult...
Visual grounding, which aims to build a correspondence between visual ob...
Few-shot semantic segmentation aims to learn to segment new object class...
Visual grounding is a ubiquitous building block in many vision-language ...
Reasoning about human-object interactions is a core problem in human-centric s...