Attention-based pyramid aggregation network for visual place recognition

Yingying Zhu, Lingxi Xie*, Jiong Wang, Liang Zheng

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    73 Citations (Scopus)

    Abstract

    Visual place recognition is challenging in urban environments and is usually treated as a large-scale image retrieval task. Two intrinsic challenges stand out: confusing objects such as cars and trees occur frequently in complex urban scenes, and buildings with repetitive structures can cause over-counting and burstiness, which degrade the image representations. To address these problems, we present an Attention-based Pyramid Aggregation Network (APANet), which is trained end-to-end for place recognition. One main component of APANet, spatial pyramid pooling, effectively encodes buildings of multiple sizes that carry geo-information. The other, the attention block, acts as a region evaluator that suppresses confusing regional features while highlighting discriminative ones. At test time, we further propose a simple yet effective PCA power whitening strategy, which significantly improves on the widely used PCA whitening by reasonably limiting the impact of over-counting. Experimental evaluations demonstrate that the proposed APANet outperforms state-of-the-art methods on two place recognition benchmarks and generalizes well to standard image retrieval datasets.
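
The PCA power whitening step mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the abstract does not give the formula, so the sketch assumes that each PCA-projected dimension is divided by its eigenvalue raised to the power alpha/2, with alpha = 1 reducing to standard PCA whitening and alpha = 0 leaving a plain PCA rotation. The function names, the parameter `alpha`, its default of 0.5, and the optional final L2 normalization are all assumptions made for illustration.

```python
import numpy as np


def fit_pca(X):
    """Fit PCA on row-vector descriptors X of shape (N, D).

    Returns the mean, the eigenvectors (as columns) and the eigenvalues
    of the sample covariance, sorted by decreasing eigenvalue.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # ascending order for symmetric input
    order = np.argsort(eigvals)[::-1]
    return mean, eigvecs[:, order], eigvals[order]


def pca_power_whiten(X, mean, eigvecs, eigvals, alpha=0.5,
                     l2_normalize=True, eps=1e-9):
    """Hypothetical power-whitening transform (assumed form, see lead-in).

    alpha = 1.0 is standard PCA whitening; a smaller alpha rescales each
    principal direction less aggressively.
    """
    proj = (X - mean) @ eigvecs                       # rotate into the PCA basis
    scaled = proj / np.power(eigvals + eps, alpha / 2.0)
    if not l2_normalize:
        return scaled
    norms = np.linalg.norm(scaled, axis=1, keepdims=True)
    return scaled / np.maximum(norms, eps)            # unit-length descriptors
```

In a retrieval setting, `fit_pca` would be run once on a set of training descriptors and `pca_power_whiten` applied to both database and query descriptors before nearest-neighbour search; the value of `alpha` would need to be tuned on held-out data.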

    Original language: English
    Title of host publication: MM 2018 - Proceedings of the 2018 ACM Multimedia Conference
    Publisher: Association for Computing Machinery, Inc
    Pages: 99-107
    Number of pages: 9
    ISBN (Electronic): 9781450356657
    Publication status: Published - 15 Oct 2018
    Event: 26th ACM Multimedia conference, MM 2018 - Seoul, Korea, Republic of
    Duration: 22 Oct 2018 to 26 Oct 2018

    Publication series

    Name: MM 2018 - Proceedings of the 2018 ACM Multimedia Conference

    Conference

    Conference: 26th ACM Multimedia conference, MM 2018
    Country/Territory: Korea, Republic of
    City: Seoul
    Period: 22/10/18 to 26/10/18
