Observing the close relationship among panoptic, semantic and instance
s...
Contrastively trained text-image models have the remarkable ability to
p...
Detecting actions in untrimmed videos should not be limited to a small,
...
We present F-VLM, a simple open-vocabulary object detection method built...
We design an open-vocabulary image segmentation model to organize an ima...
Zero-shot image classification has made promising progress by training t...
Cameras are prevalent in our daily lives, and enable many useful systems...
We present a novel deep neural network architecture for end-to-end scene...
There is a growing trend in studying deep hashing methods for content-ba...