Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers

10/16/2022
by Tao Tang, et al.

Automatic data augmentation (AutoAugment) strategies are indispensable to data-efficient supervised training protocols for vision transformers and have led to state-of-the-art results in supervised learning. Despite this success, their development and application to self-supervised vision transformers have been hindered by several barriers: the high search cost, the lack of supervision, and an unsuitable search space. In this work, we propose AutoView, a self-regularized adversarial AutoAugment method that learns views for self-supervised vision transformers by addressing these barriers. First, we reduce the search cost of AutoView to nearly zero by learning views and network parameters simultaneously in a single forward-backward step, minimizing and maximizing, respectively, the mutual information among different augmented views. Then, to avoid the information collapse caused by the lack of label supervision, we propose a self-regularized loss term that guarantees information propagation. Additionally, we present a curated augmentation policy search space for self-supervised learning, obtained by modifying the search space commonly used in supervised learning. On ImageNet, AutoView achieves a remarkable improvement over the RandAug baseline (+10.2%) and consistently outperforms the state-of-the-art manually tuned view policy by a clear margin (up to +1.3%). AutoView pretraining also benefits downstream tasks (+1.2% on Segmentation and +2.8% on a further downstream benchmark) and improves model robustness (+2.3% on ImageNet-O). Code and models will be available at https://github.com/Trent-tangtao/AutoView.
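The single-step min-max idea in the abstract can be illustrated with a deliberately tiny sketch. This is not the paper's code: a scalar "encoder" weight `w` and a scalar augmentation magnitude `m` stand in for network and view parameters, and a squared-disagreement loss stands in for the mutual-information objective. The network does gradient descent on the loss (maximizing view agreement), the view does gradient ascent on the same gradient pass (minimizing it), and a hypothetical self-regularization term `lam * (w - 1)**2` keeps `w` from collapsing to the trivial solution `w = 0`.

```python
import numpy as np

# Toy stand-in for AutoView-style single-step min-max learning (hypothetical
# sketch, not the paper's implementation). Views of an input x are x and x + m,
# so the disagreement of a linear encoder w on the two views is (w * m)**2.

def train_step(w, m, lam=0.5, lr=0.1, m_max=2.0):
    """One shared forward-backward step updating both w (descent) and m (ascent)."""
    # Gradient of the loss (w*m)^2 plus the self-regularizer lam*(w-1)^2,
    # which prevents the encoder from collapsing to the uninformative w = 0.
    grad_w = 2.0 * w * m**2 + 2.0 * lam * (w - 1.0)
    grad_m = 2.0 * w**2 * m           # same loss, adversarial objective
    w = w - lr * grad_w               # network: gradient *descent*
    m = m + lr * grad_m               # view:    gradient *ascent*
    return w, float(np.clip(m, -m_max, m_max))  # magnitudes stay bounded

w, m = 1.0, 0.5
for _ in range(200):
    w, m = train_step(w, m)
# m grows (the learned view gets harder) while w stays bounded away from 0.
```

Note the design choice the self-regularizer models: without it, the network could trivially minimize disagreement by mapping everything to zero, which is exactly the information collapse the abstract warns about.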
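The "curated search space" barrier can likewise be made concrete. The sketch below is illustrative only (the op names and magnitude ranges are assumptions, not the paper's actual space): each op is label-agnostic and exposes a bounded, tunable magnitude, and a sub-policy is a small sequence of sampled (op, magnitude) pairs, mirroring how AutoAugment-style spaces are typically defined.

```python
import random
import numpy as np

# Hypothetical curated search space for self-supervised view learning.
# Ops and ranges are illustrative; images are float arrays in [0, 1].
SSL_SEARCH_SPACE = {
    # op name -> (apply_fn(image, magnitude), magnitude range)
    "brightness": (lambda img, m: np.clip(img * (1.0 + m), 0.0, 1.0),
                   (-0.4, 0.4)),
    "contrast":   (lambda img, m: np.clip((img - img.mean()) * (1.0 + m)
                                          + img.mean(), 0.0, 1.0),
                   (-0.4, 0.4)),
    "flip":       (lambda img, m: img[:, ::-1] if m > 0.5 else img,
                   (0.0, 1.0)),
    "shift":      (lambda img, m: np.roll(img, int(m * img.shape[1]), axis=1),
                   (-0.2, 0.2)),
}

def sample_view_policy(num_ops=2, rng=random):
    """Sample a sub-policy: a list of (op_name, magnitude) pairs."""
    names = rng.sample(list(SSL_SEARCH_SPACE), num_ops)
    return [(n, rng.uniform(*SSL_SEARCH_SPACE[n][1])) for n in names]

def apply_policy(img, policy):
    """Apply each sampled op in order to produce one augmented view."""
    for name, mag in policy:
        img = SSL_SEARCH_SPACE[name][0](img, mag)
    return img
```

In an AutoView-like setting, the magnitudes in such a space would be the adversarially learned "view" parameters rather than being sampled uniformly as here.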
