A bag reconstruction method for multiple instance classification and group record linkage

Zhichun Fu*, Jun Zhou, Furong Peng, Peter Christen

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    1 Citation (Scopus)

    Abstract

    Record linking is the task of detecting records in several databases that refer to the same entity. This task aims at exploring the relationship between entities, which normally lack common identifiers in heterogeneous datasets. When entities contain multiple relational records, linking them across datasets can be more accurate when the records are treated as groups, which motivates group linking methods. Even so, individual record links may still be needed for the final group linking step. This problem can be solved by multiple instance learning, in which group links are modelled as bags and record links are considered as instances. In this paper, we propose a novel method for instance classification and group record linkage via bag reconstruction from instances. The bag reconstruction is based on modelling the distribution of negative instances in the training bags via kernel density estimation. We evaluate this approach on both synthetic and real-world data. Our results show that the proposed method can outperform several baseline methods.
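
The abstract gives only a high-level description, so the following is a minimal sketch of one possible reading of "bag reconstruction via kernel density estimation over negative instances", not the authors' algorithm. It assumes that instances are numeric feature vectors, that a fixed-bandwidth isotropic Gaussian kernel is used, and that an instance is kept in the reconstructed bag when its estimated negative density falls below a threshold. The function names, the bandwidth, the threshold, and the toy data are all hypothetical.

```python
import math


def negative_density(point, negative_instances, bandwidth):
    """Gaussian kernel density estimate at `point`, fitted on negative instances.

    Assumption: isotropic Gaussian kernel with one fixed bandwidth.
    """
    dim = len(point)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    total = 0.0
    for sample in negative_instances:
        sq_dist = sum((p - s) ** 2 for p, s in zip(point, sample))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(negative_instances) * norm)


def reconstruct_bag(bag, negative_instances, bandwidth, threshold):
    """Keep only instances that look unlikely under the negative density.

    Assumption: a low negative density marks a candidate positive instance.
    """
    return [
        x for x in bag
        if negative_density(x, negative_instances, bandwidth) < threshold
    ]


def classify_bag(bag, negative_instances, bandwidth, threshold):
    """Label a bag positive if its reconstructed version is non-empty."""
    kept = reconstruct_bag(bag, negative_instances, bandwidth, threshold)
    return len(kept) > 0, kept


# Hypothetical toy data: negative instances cluster around the origin.
NEGATIVES = [(0.0, 0.0), (0.1, -0.1), (-0.2, 0.1), (0.05, 0.15), (-0.1, -0.05)]
BAG_A = [(0.0, 0.1), (3.0, 3.0)]   # contains one instance far from the negatives
BAG_B = [(0.1, 0.0), (-0.1, 0.1)]  # every instance lies inside the negative cluster

label_a, kept_a = classify_bag(BAG_A, NEGATIVES, bandwidth=0.5, threshold=0.05)
label_b, kept_b = classify_bag(BAG_B, NEGATIVES, bandwidth=0.5, threshold=0.05)
```

In this toy setting, `BAG_A` is labelled positive because `(3.0, 3.0)` survives reconstruction, while `BAG_B` is labelled negative because none of its instances do. In a record linkage setting, each instance would presumably be a similarity vector for a candidate record pair and each bag a candidate group link, but that mapping is an interpretation of the abstract rather than something it spells out.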

    Original language: English
    Title of host publication: Advanced Data Mining and Applications - 8th International Conference, ADMA 2012, Proceedings
    Pages: 247-259
    Number of pages: 13
    Publication status: Published - 2012
    Event: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012 - Nanjing, China
    Duration: 15 Dec 2012 to 18 Dec 2012

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 7713 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012
    Country/Territory: China
    City: Nanjing
    Period: 15/12/12 to 18/12/12
