Visual and linguistic concepts naturally organize themselves in a hierar...
Although an object may appear in numerous contexts, we often describe it...
Large datasets of paired images and text have become increasingly popula...
Recent advances in self-supervised learning (SSL) have largely closed th...
The de-facto approach to many vision tasks is to start from pretrained v...
High-dimensional always-changing environments constitute a hard challeng...
We propose a new class of probabilistic neural-symbolic models, that hav...
Image captioning models have achieved impressive results on datasets
con...