Single Image Action Recognition Using Semantic Body Part Actions

Zhichen Zhao, Huimin Ma*, Shaodi You

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    70 Citations (Scopus)

    Abstract

    In this paper, we propose a novel single image action recognition algorithm based on the idea of semantic part actions. Unlike existing part-based methods, we argue that there exists a mid-level semantic, the semantic part action, and that human action is a combination of semantic part actions and context cues. In detail, we divide the human body into seven parts: head, torso, two arms, two hands, and lower body. For each of them, we define a few semantic part actions (e.g. head: laughing). Finally, we exploit these part actions to infer the entire body action (e.g. applauding). To make the proposed idea practical, we propose a deep network-based framework which consists of two subnetworks, one for part localization and the other for action prediction. The action prediction network jointly learns part-level and body-level action semantics and combines them for the final decision. Extensive experiments support our proposal that semantic part actions serve as elements of the entire body action. Our method reaches mAP of 93.9% and 91.2% on PASCAL VOC 2012 and Stanford-40, respectively, outperforming the state of the art by 2.3% and 8.6%.
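
The abstract's idea of combining part-level and body-level action semantics for a final decision can be illustrated with a toy sketch. This is not the authors' network: the part names (arms and hands split into left and right to reach seven), the `fuse` function, and the single linear layer with hand-set weights are all assumptions made purely for illustration.

```python
import math

# Assumed part list: seven parts, with arms and hands counted per side.
PARTS = ["head", "torso", "left_arm", "right_arm",
         "left_hand", "right_hand", "lower_body"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(part_scores, body_scores, weights, bias):
    """Hypothetical fusion step: concatenate per-part action probabilities
    with body-level scores, then apply one linear layer and a softmax."""
    features = []
    for part in PARTS:
        features.extend(softmax(part_scores[part]))
    features.extend(body_scores)
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Toy usage: two part actions per part, three body actions.
part_scores = {p: [0.0, 1.0] for p in PARTS}
body_scores = [0.1, 2.0, 0.3]
n_part_feats = 2 * len(PARTS)
# Hand-set weights that ignore the part features and pass body scores through;
# a trained model would learn these instead.
weights = [[0.0] * n_part_feats + [1.0 if i == j else 0.0 for j in range(3)]
           for i in range(3)]
probs = fuse(part_scores, body_scores, weights, [0.0, 0.0, 0.0])
print(probs)
```

In a learned model the weights on the part-action features would be nonzero, so that, for example, a strong "hands: clapping" response could raise the score of a body action such as applauding.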

    Original language: English
    Title of host publication: Proceedings - 2017 IEEE International Conference on Computer Vision, ICCV 2017
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 3411-3419
    Number of pages: 9
    ISBN (Electronic): 9781538610329
    Publication status: Published - 22 Dec 2017
    Event: 16th IEEE International Conference on Computer Vision, ICCV 2017 - Venice, Italy
    Duration: 22 Oct 2017 to 29 Oct 2017

    Publication series

    Name: Proceedings of the IEEE International Conference on Computer Vision
    Volume: 2017-October
    ISSN (Print): 1550-5499

    Conference

    Conference: 16th IEEE International Conference on Computer Vision, ICCV 2017
    Country/Territory: Italy
    City: Venice
    Period: 22/10/17 to 29/10/17
