Cross-modal pre-training has shown impressive performance on a wide rang...
Recently, large-scale diffusion models, e.g., Stable Diffusion and DallE...
Recent advances in vision-language pre-training have enabled machines to...
Large vision and language models, such as Contrastive Language-Image Pre...
Large-scale cross-modal pre-training paradigms have recently shown ubiqu...
Inspired by the success of visual-language methods (VLMs) in zero-shot c...
Unsupervised large-scale vision-language pre-training has shown promisin...