Instance-aware detailed action labeling in videos

Hongtao Yang, Xuming He, Fatih Porikli

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    1 Citation (Scopus)

    Abstract

    We address the problem of detailed sequence labeling of complex activities in videos, which aims to assign an action label to every frame. Previous work typically focuses on predicting an action class label for each frame in a sequence without reasoning about action instances. However, such category-level labeling is inefficient at encoding global constraints at the action-instance level and tends to produce inconsistent results. In this work, we consider a fusion approach that exploits the synergy between action detection and sequence labeling for complex activities. To this end, we propose an instance-aware sequence labeling method that utilizes cues from action instance detection. In particular, we design an LSTM-based fusion network that integrates framewise action labeling and action instance prediction to produce a final consistent labeling. To evaluate our method, we create GADD, a large-scale RGBD video dataset of gym activities for sequence labeling and action detection. Experimental results on the GADD dataset show that our method consistently outperforms all state-of-the-art methods in terms of labeling accuracy.

    Original language: English
    Title of host publication: Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 1577-1586
    Number of pages: 10
    ISBN (Electronic): 9781538648865
    Publication status: Published - 3 May 2018
    Event: 18th IEEE Winter Conference on Applications of Computer Vision, WACV 2018 - Lake Tahoe, United States
    Duration: 12 Mar 2018 – 15 Mar 2018

    Publication series

    Name: Proceedings - 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018
    Volume: 2018-January

    Conference

    Conference: 18th IEEE Winter Conference on Applications of Computer Vision, WACV 2018
    Country/Territory: United States
    City: Lake Tahoe
    Period: 12/03/18 – 15/03/18
