Learning to generate object segment proposals with multi-modal cues

Haoyang Zhang*, Xuming He, Fatih Porikli

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    Abstract

    This paper presents a learning-based method for generating object segment proposals from stereo images. Unlike existing methods, which mostly rely on low-level appearance cues and handcrafted similarity functions to group segments, our method represents each region with learned deep features and designed geometric features, and uses a learned similarity network to guide the grouping process. Given an initial segmentation hierarchy, we sequentially merge adjacent regions at each level based on their affinity as measured by the similarity network. This merging process generates new segmentation hierarchies, from which we build a pool of region proposals by taking region singletons, pairs, triplets and 4-tuples. In addition, we learn a ranking network that predicts the objectness score of each region proposal, and we diversify the ranking using Maximum Marginal Relevance measures. Experiments on the Cityscapes dataset show that our approach performs significantly better than the baseline and the current state of the art.
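
The diversification step mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact Maximum Marginal Relevance formulation, so the version below is an assumption: it uses the common greedy form, in which each pick maximizes a weighted objectness score minus the highest mask overlap (IoU) with any proposal already selected. The names `mask_iou`, `mmr_rerank` and `trade_off` are hypothetical and are not taken from the paper.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection-over-union of two boolean masks of equal shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0


def mmr_rerank(masks, scores, trade_off=0.5):
    """Greedy MMR ordering of proposals (illustrative, not the paper's code).

    masks:     list of boolean arrays, one per proposal
    scores:    objectness score per proposal (higher is better)
    trade_off: weight on objectness; 1.0 reduces to plain score ranking
    Returns the proposal indices in re-ranked order.
    """
    remaining = list(range(len(masks)))
    selected = []
    # Highest overlap of each proposal with anything selected so far.
    max_overlap = np.zeros(len(masks))
    while remaining:
        best = max(
            remaining,
            key=lambda i: trade_off * scores[i]
            - (1.0 - trade_off) * max_overlap[i],
        )
        selected.append(best)
        remaining.remove(best)
        # Update the redundancy penalty of the proposals still in the pool.
        for i in remaining:
            overlap = mask_iou(masks[i], masks[best])
            max_overlap[i] = max(max_overlap[i], overlap)
    return selected


if __name__ == "__main__":
    a = np.zeros((4, 4), dtype=bool)
    a[:2, :2] = True              # 4-pixel region
    b = a.copy()
    b[2, 0] = True                # near-duplicate of a (IoU 0.8)
    c = np.zeros((4, 4), dtype=bool)
    c[3, 3] = True                # disjoint region
    print(mmr_rerank([a, b, c], [0.9, 0.8, 0.5], trade_off=0.5))
```

With `trade_off=0.5`, the near-duplicate proposal is pushed behind the disjoint one even though its objectness score is higher; with `trade_off=1.0` the order follows the scores alone.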

    Original language: English
    Title of host publication: Computer Vision - ACCV 2016 - 13th Asian Conference on Computer Vision, Revised Selected Papers
    Editors: Yoichi Sato, Ko Nishino, Vincent Lepetit, Shang-Hong Lai
    Publisher: Springer Verlag
    Pages: 121-136
    Number of pages: 16
    ISBN (Print): 9783319541808
    Publication status: Published - 2017
    Event: 13th Asian Conference on Computer Vision, ACCV 2016 - Taipei, Taiwan
    Duration: 20 Nov 2016 to 24 Nov 2016

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 10111 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 13th Asian Conference on Computer Vision, ACCV 2016
    Country/Territory: Taiwan
    City: Taipei
    Period: 20/11/16 to 24/11/16
