Motion Equivariance OF Event-based Camera Data with the Temporal Normalization Transform
In this work, we focus on using convolution neural networks (CNN) to perform object recognition on the event data. In object recognition, it is important for a neural network to be robust to the variations of the data during testing. For traditional cameras, translations are well handled because CNNs are naturally equivariant to translations. However, because event cameras record the change of light intensity of an image, the geometric shape of event volumes will not only depend on the objects but also on their relative motions with respect to the camera. The deformation of the events caused by motions causes the CNN to be less robust to unseen motions during inference. To address this problem, we would like to explore the equivariance property of CNNs, a well-studied area that demonstrates to produce predictable deformation of features under certain transformations of the input image.
READ FULL TEXT