Activity Detection with Latent Sub-event Hierarchy Learning

03/16/2018
by   AJ Piergiovanni, et al.
0

In this paper, we introduce a new convolutional layer named the Temporal Gaussian Mixture (TGM) layer and present how it can be used to efficiently capture temporal structure in continuous activity videos. Our layer is designed to allow the model to learn a latent hierarchy of sub-event intervals. Our approach is fully differentiable while relying on a significantly less number of parameters, enabling its end-to-end training with standard backpropagation. We present our convolutional video models with multiple TGM layers for activity detection. Our experiments on multiple datasets including Charades and MultiTHUMOS confirm the benefit of our TGM layers, illustrating that it outperforms other models and temporal convolutions.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset