Pseudo-labels for Supervised Learning on Dynamic Vision Sensor Data, Applied to Object Detection under Ego-motion
In recent years, dynamic vision sensors (DVS), also known as event-based cameras or neuromorphic sensors, have seen increased use due to various advantages over traditional frame-based cameras. Their high temporal resolution overcomes motion blur, their high dynamic range copes with extreme illumination conditions, and their low power consumption makes them suitable for platforms such as drones and self-driving cars. While frame-based computer vision is mature thanks to the large amounts of data and ground truth available, event-based computer vision is still a work in progress: datasets are far less plentiful, and ground truth is scarce for tasks such as object detection. In this paper, we suggest a way to overcome the lack of labeled ground truth by introducing a simple method to generate pseudo-labels for dynamic vision sensor data, assuming that the corresponding frame-based data is also available. These pseudo-labels can be treated as ground truth when training supervised learning algorithms, and we show, for the first time, event-based car detection under ego-motion in a realistic environment at 100 frames per second with an average precision of 41.7.
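The core idea of transferring labels from frame-based data to temporally aligned event data can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the function name, the nearest-frame assignment rule, and the box format are assumptions for the sake of the example, and in practice the boxes would come from a pre-trained frame-based detector.

```python
import numpy as np

def assign_pseudo_labels(frame_times, frame_boxes, event_window_times):
    """Copy detections from the temporally nearest conventional frame
    onto each slice of event data, to serve as pseudo ground truth.

    frame_times:        (N,) timestamps of frame-based images (seconds)
    frame_boxes:        list of N arrays, each (k_i, 4) of [x1, y1, x2, y2]
                        boxes produced by a frame-based detector (assumed given)
    event_window_times: (M,) center timestamps of event-data slices
    Returns a list of M box arrays, one per event slice.
    """
    frame_times = np.asarray(frame_times)
    labels = []
    for t in event_window_times:
        # Nearest-neighbor match in time between event slice and frame.
        i = int(np.argmin(np.abs(frame_times - t)))
        labels.append(frame_boxes[i])
    return labels

# Illustrative usage with two frames and two event slices:
frame_times = [0.00, 0.01]
frame_boxes = [np.array([[0, 0, 10, 10]]), np.array([[5, 5, 15, 15]])]
pseudo = assign_pseudo_labels(frame_times, frame_boxes, [0.002, 0.009])
```

Because the event stream has a much higher temporal resolution than the frames, this nearest-frame assignment is a simplification; interpolating box positions between frames would be a natural refinement for fast ego-motion.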