A theoretical framework for knowledge-based entity resolution

Klaus Dieter Schewe, Qing Wang*

*Corresponding author for this work

    Research output: Contribution to journalArticlepeer-review

    7 Citations (Scopus)

    Abstract

    Entity resolution is the process of determining whether a collection of entity representations refer to the same entity in the real world. In this paper we introduce a theoretical framework that supports knowledge-based entity resolution. From a logical point of view, the expressive power of the framework is equivalent to a decidable fragment of first-order logic including conjunction, disjunction and a certain form of negation. Although the framework is expressive for representing knowledge about entity resolution in a collective way, the questions that arise are: (1) how efficiently can knowledge patterns be processed; (2) how effectively can redundancy among knowledge patterns be eliminated. In answering these questions, we first study the evaluation problem for knowledge patterns. Our results show that this problem is NP-complete w.r.t. combined complexity but in ptime w.r.t. data complexity. This nice property leads us to investigate the containment problem for knowledge patterns, which turns out to be NP-complete. We further develop a notion of optimality for knowledge patterns and a mechanism of optimizing a knowledge model (i.e. a finite set of knowledge patterns). We prove that the optimality decision problem for knowledge patterns is still NP-complete.

    Original languageEnglish
    Pages (from-to)101-126
    Number of pages26
    JournalTheoretical Computer Science
    Volume549
    Issue numberC
    DOIs
    Publication statusPublished - 2014

    Fingerprint

    Dive into the research topics of 'A theoretical framework for knowledge-based entity resolution'. Together they form a unique fingerprint.

    Cite this