Computer Science Faculty Publications

STAP: Spatial-Temporal Attention-Aware Pooling for Action Recognition

Tam Nguyen, University of Dayton
Zheng Song, Visenze Pte.
Shuicheng Yan, National University of Singapore

Document Type

Article

Publication Date

1-2015

Publication Source

IEEE Transactions on Circuits and Systems for Video Technology

Abstract

Human action recognition is valuable for numerous practical applications, e.g., gaming, video surveillance, and video search. In this paper, we hypothesize that action classification can be improved by designing a smart feature pooling strategy under the widely used bag-of-words representation. Building on automatic video saliency analysis, we propose a spatial-temporal attention-aware pooling (STAP) scheme for feature pooling. First, video saliency is predicted using a video saliency model. The localized spatial-temporal features are then pooled at different saliency levels to form video-saliency-guided channels. Saliency-aware matching kernels are then derived as the similarity measure for these channels. Intuitively, the proposed kernels calculate the similarities of the video foreground (salient areas) or background (nonsalient areas) at different levels. Finally, the kernels are fed into popular support vector machines for action classification. Extensive experiments on three popular data sets for action classification validate the effectiveness of the proposed method, which outperforms state-of-the-art methods with 95.3% on UCF Sports (better by 4.0%) and 87.9% on the YouTube data set (better by 2.5%), and achieves comparable results on the Hollywood2 data set.
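
The pooling and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed saliency thresholds, the global histogram normalization, and the use of histogram intersection as the per-channel similarity are all assumptions made for illustration. Each local feature is assumed to be already quantized to a visual-word index and to carry a saliency score in [0, 1].

```python
import numpy as np

def saliency_channels(words, saliency, vocab_size, thresholds=(0.33, 0.66)):
    """Pool visual words into one bag-of-words histogram per saliency level.

    `thresholds` is a hypothetical choice of level boundaries, not a value
    taken from the paper.
    """
    words = np.asarray(words, dtype=int)
    saliency = np.asarray(saliency, dtype=float)
    # Level 0 holds the least salient features, the last level the most salient.
    levels = np.digitize(saliency, thresholds)
    n_levels = len(thresholds) + 1
    hists = np.zeros((n_levels, vocab_size))
    for level in range(n_levels):
        hists[level] = np.bincount(words[levels == level], minlength=vocab_size)
    total = hists.sum()
    # Normalize over all channels so that the histograms of a video sum to 1.
    return hists / total if total > 0 else hists

def channel_kernel(h_a, h_b, weights=None):
    """Weighted sum of per-channel histogram-intersection similarities.

    Histogram intersection is an assumed stand-in for the paper's
    saliency-aware matching kernels.
    """
    per_channel = np.minimum(h_a, h_b).sum(axis=1)
    if weights is None:
        weights = np.ones(len(per_channel))
    return float(np.dot(weights, per_channel))

# Toy example: two videos with a 3-word vocabulary.
h_a = saliency_channels([0, 1, 1, 2, 0], [0.1, 0.9, 0.8, 0.5, 0.2], vocab_size=3)
h_b = saliency_channels([0, 1, 2], [0.9, 0.1, 0.5], vocab_size=3)
k_ab = channel_kernel(h_a, h_b)
```

A Gram matrix built from such pairwise kernel values could then be passed to a support vector machine that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.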

Inclusive pages

77-86

ISBN/ISSN

1051-8215

Comments

Permission documentation is on file.

Copyright

Publisher

IEEE

Volume

Peer Reviewed

yes

Issue

eCommons Citation

Nguyen, Tam; Song, Zheng; and Yan, Shuicheng, "STAP: Spatial-Temporal Attention-Aware Pooling for Action Recognition" (2015). Computer Science Faculty Publications. 78.
https://ecommons.udayton.edu/cps_fac_pub/78
