Temporal Action Labeling using Action Sets

06/02/2017
by Alexander Richard, et al.

Action detection and temporal segmentation of actions in videos are topics of increasing interest. While fully supervised systems have gained much attention lately, full annotation of each action within a video is costly and impractical for large amounts of video data. Thus, weakly supervised action detection and temporal segmentation methods are of great importance. While most works in this area assume that an ordered sequence of the occurring actions is given, our approach uses only a set of actions. Such action sets provide much less supervision, since neither the ordering of the actions nor the number of times each action occurs is known. In exchange, they can be obtained easily, for instance from meta-tags, while ordered sequences still require human annotation. We introduce a system that automatically learns to temporally segment and label actions in a video, where the only supervision used is the action set. We evaluate our method on three datasets and show that it performs close to or on par with recent weakly supervised methods that require ordering constraints.
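
To make the difference between the two kinds of weak supervision concrete, the following minimal sketch derives both an ordered action sequence and an action set from the same frame-level labeling. It is an illustration only, not the method of the paper: the function names `to_transcript` and `to_action_set` and the action labels are hypothetical.

```python
from itertools import groupby


def to_transcript(frame_labels):
    """Collapse per-frame labels into the ordered sequence of action segments."""
    return [label for label, _ in groupby(frame_labels)]


def to_action_set(frame_labels):
    """Keep only which actions occur; order and occurrence counts are lost."""
    return set(frame_labels)


# Hypothetical frame-level ground truth for a short video.
frames = ["pour", "pour", "stir", "stir", "stir", "pour", "drink"]

transcript = to_transcript(frames)   # ordered, "pour" appears twice
action_set = to_action_set(frames)   # unordered, each action appears once
```

The transcript still reveals that "pour" happens twice and that "stir" comes between the two occurrences, whereas the action set only reveals which three actions are present somewhere in the video.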
