Knowledge Guided Attention and Inference for Describing Images Containing Unseen Objects

Aditya Mogadala*, Umanga Bista, Lexing Xie, Achim Rettinger

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    2 Citations (Scopus)

    Abstract

    Images on the Web encapsulate diverse knowledge about varied abstract concepts. They cannot be sufficiently described with models learned from image-caption pairs that mention only a small number of visual object categories. In contrast, large-scale knowledge graphs contain many more concepts that can be detected by image recognition models. Hence, to assist description generation for images that contain visual objects unseen in image-caption pairs, we propose a two-step process that leverages large-scale knowledge graphs. In the first step, a multi-entity recognition model is built to annotate images with concepts not mentioned in any caption. In the second step, those annotations are leveraged as external semantic attention and constrained inference in the image description generation model. Evaluations show that our models outperform most of the prior work on out-of-domain MSCOCO image description generation and also scale better to broad domains with more unseen objects.

    Original language: English
    Title of host publication: The Semantic Web - 15th International Conference, ESWC 2018, Proceedings
    Editors: Aldo Gangemi, Raphaël Troncy, Roberto Navigli, Laura Hollink, Maria-Esther Vidal, Pascal Hitzler, Anna Tordai, Mehwish Alam
    Publisher: Springer Verlag
    Pages: 415-429
    Number of pages: 15
    ISBN (Print): 9783319934167
    DOIs
    Publication status: Published - 2018
    Event: 15th International Conference on Extended Semantic Web Conference, ESWC 2018 - Heraklion, Greece
    Duration: 3 Jun 2018 to 7 Jun 2018

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 10843 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 15th International Conference on Extended Semantic Web Conference, ESWC 2018
    Country/Territory: Greece
    City: Heraklion
    Period: 3/06/18 to 7/06/18
