Abstract
Entity resolution is the process of determining whether a collection of entity representations refer to the same entity in the real world. In this paper we introduce a theoretical framework that supports knowledge-based entity resolution. From a logical point of view, the expressive power of the framework is equivalent to a decidable fragment of first-order logic including conjunction, disjunction and a certain form of negation. Although the framework is expressive for representing knowledge about entity resolution in a collective way, the questions that arise are: (1) how efficiently can knowledge patterns be processed; (2) how effectively can redundancy among knowledge patterns be eliminated. In answering these questions, we first study the evaluation problem for knowledge patterns. Our results show that this problem is NP-complete w.r.t. combined complexity but in ptime w.r.t. data complexity. This nice property leads us to investigate the containment problem for knowledge patterns, which turns out to be NP-complete. We further develop a notion of optimality for knowledge patterns and a mechanism of optimizing a knowledge model (i.e. a finite set of knowledge patterns). We prove that the optimality decision problem for knowledge patterns is still NP-complete.
Original language | English |
---|---|
Pages (from-to) | 101-126 |
Number of pages | 26 |
Journal | Theoretical Computer Science |
Volume | 549 |
Issue number | C |
DOIs | |
Publication status | Published - 2014 |