Mannat Singh

research

∙ 05/09/2023

ImageBind: One Embedding Space To Bind Them All

We present ImageBind, an approach to learn a joint embedding across six ...

Rohit Girdhar, et al.

research

∙ 03/23/2023

The effectiveness of MAE pre-pretraining for billion-scale pretraining

This paper revisits the standard pretrain-then-finetune paradigm used in...

Mannat Singh, et al.

research

∙ 06/16/2022

OmniMAE: Single Model Masked Pretraining on Images and Videos

Transformer-based architectures have become competitive across a variety...

Rohit Girdhar, et al.

research

∙ 01/20/2022

Omnivore: A Single Model for Many Visual Modalities

Prior work has studied different visual modalities in isolation and deve...

Rohit Girdhar, et al.

research

∙ 01/20/2022

Revisiting Weakly Supervised Pre-Training of Visual Perception Models

Model pre-training is a cornerstone of modern visual recognition systems...

Mannat Singh, et al.

research

∙ 06/28/2021

Early Convolutions Help Transformers See Better

Vision transformer (ViT) models exhibit substandard optimizability. In p...

Tete Xiao, et al.

research

∙ 04/26/2021

MDETR – Modulated Detection for End-to-End Multi-Modal Understanding

Multi-modal reasoning systems rely on a pre-trained object detector to e...

Aishwarya Kamath, et al.

research

∙ 03/11/2021

Fast and Accurate Model Scaling

In this work we analyze strategies for convolutional neural network scal...

Piotr Dollár, et al.

research

∙ 03/02/2021

Self-supervised Pretraining of Visual Features in the Wild

Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and S...

Priya Goyal, et al.

