Early Action Recognition With Category Exclusion Using Policy-Based Reinforcement Learning

Bibliographic Details
Published in:	IEEE transactions on circuits and systems for video technology 2020-12, Vol.30 (12), p.4626-4638
Main Authors:	Weng, Junwu; Jiang, Xudong; Zheng, Wei-Long; Yuan, Junsong
Format:	Article
Language:	English
Subjects:	Accuracy; Category exclusion; Classification; Datasets; Early action recognition; Feature extraction; Learning; Learning (artificial intelligence); Machine learning; Policy-based reinforcement learning; Recognition; Reinforcement learning; Task analysis; Three-dimensional displays; Visualization

Description
Summary:	The goal of early action recognition is to predict the action label when the sequence is only partially observed. Existing methods treat the early action recognition task as a sequential classification problem over different observation ratios of an action sequence. Since these models are trained by differentiating the positive category from all negative classes, they ignore the diverse information carried by different negative categories, which we believe can be collected to help improve recognition performance. In this paper, we step toward a new direction by introducing category exclusion to early action recognition. We model the exclusion as a mask operation on the classification probability output of a pre-trained early action recognition classifier. Specifically, we use policy-based reinforcement learning to train an agent. The agent generates a series of binary masks to exclude interfering negative categories during action execution and hence helps improve the recognition accuracy. The proposed method is evaluated on three benchmark recognition datasets: NTU-RGBD, First-Person Hand Action, and UCF-101. The proposed method enhances the recognition accuracy consistently across all observation ratios on the three datasets, and the accuracy improvements at the early stages are especially significant.
ISSN:	1051-8215, 1558-2205
DOI:	10.1109/TCSVT.2020.2976789
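
Illustration:	The mask operation described in the summary can be pictured with a minimal sketch. This is not the authors' implementation: the function names, the renormalisation step, the fallback when every category is masked, and the example numbers are all assumptions made for illustration. The reinforcement-learning agent that would produce the mask is not modelled here; the mask is simply given by hand.

```python
def apply_exclusion_mask(probs, mask):
    """Zero out excluded categories and renormalise the rest.

    probs: classifier probabilities, one per category.
    mask:  binary list, 1 = keep the category, 0 = exclude it.
    (Renormalising is an illustrative assumption, not taken from the paper.)
    """
    if len(probs) != len(mask):
        raise ValueError("probs and mask must have the same length")
    kept = [p * m for p, m in zip(probs, mask)]
    total = sum(kept)
    if total == 0:
        # Nothing left after masking: fall back to the unmasked output (assumption).
        return list(probs)
    return [k / total for k in kept]


def predict(probs, mask):
    """Return the index of the most probable category after masking."""
    masked = apply_exclusion_mask(probs, mask)
    return max(range(len(masked)), key=masked.__getitem__)


# Hypothetical classifier output over four categories at an early observation ratio.
probs = [0.40, 0.35, 0.15, 0.10]
# Hypothetical mask: category 0 is treated as an interfering negative and excluded.
mask = [0, 1, 1, 1]

label_before = predict(probs, [1, 1, 1, 1])  # no exclusion
label_after = predict(probs, mask)           # with exclusion
masked_probs = apply_exclusion_mask(probs, mask)
```

In this toy case the unmasked prediction is category 0, and excluding that category moves the prediction to category 1. A series of such masks applied as more of the sequence is observed would correspond to the "series of binary masks" mentioned in the summary.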