An Empirical Study and Comparison of Recent Few-Shot Object Detection Algorithms
The generic object detection (GOD) task has been successfully tackled by recent deep neural networks trained on an avalanche of annotated samples from a set of common classes. However, it remains non-trivial to generalize these object detectors to novel long-tailed object classes, which have only a few labeled training samples. To this end, Few-Shot Object Detection (FSOD) has recently become topical, as it mimics the human ability of learning to learn and intelligently transfers the learned generic object knowledge from the common heavy-tailed classes to the novel long-tailed object classes. In particular, research in this emerging field has flourished in recent years, with various benchmarks, backbones, and methodologies proposed. Several insightful FSOD survey articles already review these works, systematically studying and comparing them in two groups: fine-tuning/transfer learning methods and meta-learning methods. In contrast, we compare these FSOD algorithms from a new perspective, using a taxonomy based on their contributions, i.e., data-oriented, model-oriented, and algorithm-oriented ones. On this basis, we conduct an empirical study and comparison of the recent achievements in FSOD. Furthermore, we analyze the technical challenges and the merits and demerits of these methods, and envision future directions for FSOD. Specifically, we give an overview of FSOD, including the problem definition, common datasets, and evaluation protocols. A new taxonomy is then proposed based on the role of prior knowledge during object detection of novel classes. Following this taxonomy, we provide a systematic review of the advances in FSOD. Finally, further discussions on performance, challenges, and future directions are presented.