We introduce an object-aware decoder for improving the performance of
sp...
Large-scale visual language models are widely used as pre-trained models...
The objective of this work is to learn an object-centric video
represent...
Our objective in this work is fine-grained classification of actions in
...
In this work, our objective is to address the problems of generalization...