Learning Object Relation Graph and Tentative Policy for Visual Navigation

Heming Du, Xin Yu*, Liang Zheng

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    67 Citations (Scopus)

    Abstract

    Target-driven visual navigation aims at navigating an agent towards a given target based on the agent's observations. In this task, it is critical to learn an informative visual representation and a robust navigation policy. Aiming to improve these two components, this paper proposes three complementary techniques: an object relation graph (ORG), trial-driven imitation learning (IL), and a memory-augmented tentative policy network (TPN). ORG improves visual representation learning by integrating object relationships, including category closeness and spatial correlations, e.g., a TV usually co-occurs with a remote spatially. Both trial-driven IL and TPN underlie a robust navigation policy, instructing the agent to escape from deadlock states, such as looping or being stuck. Specifically, trial-driven IL is a type of supervision used in policy network training, while TPN, mimicking the IL supervision in unseen environments, is applied in testing. Experiments in the artificial environment AI2-Thor validate that each of the techniques is effective. When combined, the techniques bring significant improvement over baseline methods in navigation effectiveness and efficiency in unseen environments. We report 22.8% and 23.5% increase in success rate and Success weighted by Path Length (SPL), respectively. The code is available at https://github.com/xiaobaishu0097/ECCV-VN.git.
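The two metrics named in the abstract, success rate and Success weighted by Path Length (SPL), can be illustrated with a minimal sketch. It assumes the commonly used definition of SPL, in which each successful episode contributes the ratio of the shortest-path length to the larger of the shortest-path length and the length of the path actually taken, averaged over all episodes. The function and parameter names below are hypothetical and are not taken from the authors' repository, and the exact evaluation protocol of the paper may differ.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    if not successes:
        raise ValueError("at least one episode is required")
    return sum(1 for s in successes if s) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    path_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length over a batch of episodes.

    Assumed definition: mean over episodes of S_i * l_i / max(p_i, l_i),
    where S_i is the success flag, l_i the shortest-path length, and
    p_i the length of the path the agent actually took.
    """
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if not successes:
        raise ValueError("at least one episode is required")

    total = 0.0
    for succeeded, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        if shortest <= 0:
            # Assumption of this sketch: the target is never at the start position.
            raise ValueError("shortest-path lengths must be positive")
        if succeeded:
            total += shortest / max(taken, shortest)
    return total / len(successes)


# Toy data (made up for illustration, not results from the paper).
episode_success = [True, False, True]
episode_shortest = [10.0, 5.0, 4.0]
episode_taken = [20.0, 7.0, 4.0]

sr_value = success_rate(episode_success)
spl_value = spl(episode_success, episode_shortest, episode_taken)
```

In this toy batch the first episode succeeds with a path twice the optimal length, the second fails, and the third succeeds along an optimal path, so SPL penalizes the detour in the first episode while success rate does not.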

    Original language: English
    Title of host publication: Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
    Editors: Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 19-34
    Number of pages: 16
    ISBN (Print): 9783030585709
    DOIs
    Publication status: Published - 2020
    Event: 16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom
    Duration: 23 Aug 2020 to 28 Aug 2020

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 12352 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 16th European Conference on Computer Vision, ECCV 2020
    Country/Territory: United Kingdom
    City: Glasgow
    Period: 23/08/20 to 28/08/20
