Benchmarking Representation Learning for Natural World Image Collections

by Grant Van Horn et al.

Recent progress in self-supervised learning has resulted in models that are capable of extracting rich representations from image collections without requiring any explicit label supervision. However, to date, the vast majority of these approaches have restricted themselves to training on standard benchmark datasets such as ImageNet. We argue that fine-grained visual categorization problems, such as plant and animal species classification, provide an informative testbed for self-supervised learning. To facilitate progress in this area, we present two new natural world visual classification datasets, iNat2021 and NeWT. The former consists of 2.7M images from 10k different species uploaded by users of the citizen science application iNaturalist. We designed the latter, NeWT, in collaboration with domain experts, with the aim of benchmarking the performance of representation learning algorithms on a suite of challenging natural world binary classification tasks that go beyond standard species classification. These two new datasets allow us to explore questions related to large-scale representation and transfer learning in the context of fine-grained categories. We provide a comprehensive analysis of feature extractors trained with and without supervision on ImageNet and iNat2021, shedding light on the strengths and weaknesses of different learned features across a diverse set of tasks. We find that features produced by standard supervised methods still outperform those produced by self-supervised approaches such as SimCLR. However, improved self-supervised learning methods are constantly being released, and the iNat2021 and NeWT datasets are a valuable resource for tracking their progress.
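The comparison of feature extractors described above typically follows a linear-probe protocol: freeze the pretrained backbone, extract embeddings for each image, and fit a simple linear classifier per downstream task. As a minimal, hedged sketch of that protocol (the function names are illustrative, and the synthetic Gaussian features below stand in for real backbone embeddings, which the paper's actual evaluation would use):

```python
import numpy as np

def linear_probe(x, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe on frozen features via gradient descent.

    x: (n, d) feature matrix from a frozen backbone; y: (n,) binary labels.
    """
    w = np.zeros(x.shape[1])
    b = 0.0
    n = len(y)
    for _ in range(steps):
        logits = x @ w + b
        p = 1.0 / (1.0 + np.exp(-logits))   # sigmoid probabilities
        grad = p - y                        # dL/dlogits for cross-entropy loss
        w -= lr * (x.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def probe_accuracy(w, b, x, y):
    """Binary accuracy of the fitted probe."""
    return float(((x @ w + b > 0).astype(int) == y).mean())

# Synthetic stand-in for frozen backbone embeddings on a NeWT-style binary
# task (hypothetical data, not from the paper).
rng = np.random.default_rng(42)
feats_pos = rng.normal(1.0, 1.0, size=(200, 16))
feats_neg = rng.normal(-1.0, 1.0, size=(200, 16))
x = np.vstack([feats_pos, feats_neg])
y = np.array([1] * 200 + [0] * 200)

w, b = linear_probe(x, y)
acc = probe_accuracy(w, b, x, y)
```

In practice one would extract `x` from, e.g., an ImageNet- or iNat2021-pretrained network, and the probe's accuracy on each binary task then serves as the comparison metric between pretraining schemes.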



Related papers

- When Does Contrastive Visual Representation Learning Work? — Recent self-supervised representation learning techniques have largely c...
- Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring — We address the problem of learning self-supervised representations from ...
- Mugs: A Multi-Granular Self-Supervised Learning Framework — In self-supervised learning, multi-granular features are heavily desired...
- Improvements to context based self-supervised learning — We develop a set of methods to improve on the results of self-supervised...
- On the Origin of Species of Self-Supervised Learning — In the quiet backwaters of cs.CV, cs.LG and stat.ML, a cornucopia of new...
- Simple Control Baselines for Evaluating Transfer Learning — Transfer learning has witnessed remarkable progress in recent years, for...
- Concept Generalization in Visual Representation Learning — Measuring concept generalization, i.e., the extent to which models train...
