As it is empirically observed that Vision Transformers (ViTs) are quite
...
Self-supervised learning (SSL) has made remarkable progress in visual
re...
Masked image modeling (MIM) has attracted much research attention due to...
In contrastive self-supervised learning, the common way to learn
discrim...