Multiclass semantic video segmentation with object-level active inference

Buyu Liu, Xuming He

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    53 Citations (Scopus)

    Abstract

    We address the problem of integrating object reasoning with supervoxel labeling in multiclass semantic video segmentation. To this end, we first propose an object-augmented dense CRF in the spatio-temporal domain, which captures long-range dependencies between supervoxels and imposes consistency between object and supervoxel labels. We develop an efficient mean field inference algorithm to jointly infer the supervoxel labels, object activations, and their occlusion relations for a moderate number of object hypotheses. To scale up our method, we adopt an active inference strategy that improves efficiency by adaptively selecting object subgraphs in the object-augmented dense CRF. We formulate this selection problem as a Markov Decision Process and learn an approximately optimal policy based on a reward of accuracy improvement and a set of model and input features. We evaluate our method on three publicly available multiclass video semantic segmentation datasets and demonstrate superior efficiency and accuracy.
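
The mean field step mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' object-augmented model: it assumes a small fully connected CRF with Potts pairwise terms over supervoxel-like nodes only, and it omits the object activation and occlusion variables. The names `mean_field`, `labels`, `unary`, `affinity`, and `compat_weight` are hypothetical and chosen for illustration.

```python
import math


def mean_field(unary, affinity, compat_weight=1.0, n_iters=10):
    """Naive mean field inference for a small fully connected Potts CRF.

    unary[i][l]    : cost of giving node i label l (lower is better)
    affinity[i][j] : non-negative similarity between nodes i and j
    Returns q, where q[i][l] approximates the marginal of label l at node i.
    This is an illustrative assumption, not the paper's energy function.
    """
    n, n_labels = len(unary), len(unary[0])

    def softmax_neg(costs):
        # Softmax over negated costs, shifted by the minimum for stability.
        m = min(costs)
        exps = [math.exp(-(c - m)) for c in costs]
        z = sum(exps)
        return [e / z for e in exps]

    # Initialise each node's distribution from its unary costs alone.
    q = [softmax_neg(u) for u in unary]
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            costs = []
            for l in range(n_labels):
                # Potts penalty: affinity-weighted probability mass of the
                # other nodes that do NOT take label l.
                penalty = sum(
                    affinity[i][j] * (1.0 - q[j][l])
                    for j in range(n) if j != i
                )
                costs.append(unary[i][l] + compat_weight * penalty)
            new_q.append(softmax_neg(costs))
        q = new_q  # parallel update of all nodes
    return q


def labels(q):
    """Return the most probable label index for each node."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Three nodes, two labels: node 2 weakly prefers label 1 on its own,
# while its two neighbours strongly prefer label 0.
unary = [[0.0, 2.0], [0.0, 2.0], [1.1, 1.0]]
affinity = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
smoothed = labels(mean_field(unary, affinity))
```

In this toy setting the pairwise term pulls the weakly decided node toward the label of its neighbours, which is the kind of long-range consistency a dense CRF is meant to encourage. Practical dense CRF implementations avoid the quadratic loop over node pairs by using efficient filtering, which this sketch does not attempt.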

    Original language: English
    Title of host publication: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
    Publisher: IEEE Computer Society
    Pages: 4286-4294
    Number of pages: 9
    ISBN (Electronic): 9781467369640
    Publication status: Published - 14 Oct 2015
    Event: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015 - Boston, United States
    Duration: 7 Jun 2015 to 12 Jun 2015

    Publication series

    Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
    Volume: 07-12-June-2015
    ISSN (Print): 1063-6919

    Conference

    Conference: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
    Country/Territory: United States
    City: Boston
    Period: 7/06/15 to 12/06/15
