ErGAN: Generative adversarial networks for entity resolution

Jingyu Shao, Qing Wang, Asiri Wijesinghe, Erhard Rahm

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    4 Citations (Scopus)

    Abstract

    Entity resolution aims to identify records that represent the same real-world entity from one or more datasets. A major challenge in learning-based entity resolution is how to reduce the label cost for training. Due to the quadratic nature of record pair comparison, labeling is a costly task that often requires significant effort from human experts. Inspired by recent advances in generative adversarial networks (GANs), we propose a novel deep learning method, called ErGAN, to address this challenge. ErGAN consists of two key components, a label generator and a discriminator, which are optimized alternately through adversarial learning. To alleviate the issues of overfitting and highly imbalanced distributions, we design two novel modules for diversity and propagation, which can greatly improve the model's generalization power. We have conducted extensive experiments to empirically verify the labeling and learning efficiency of ErGAN. The experimental results show that ErGAN beats the state-of-the-art baselines, including unsupervised, semi-supervised, and supervised learning methods.
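The "quadratic nature of record pair comparison" mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single dataset of n records in which every unordered pair is a candidate for labeling (no blocking or filtering), giving n(n-1)/2 pairs. The function names `candidate_pairs` and `pair_count` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def candidate_pairs(records):
    """Return every unordered pair of records (assumes no blocking step)."""
    return list(combinations(records, 2))


def pair_count(n):
    """Number of unordered pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Hypothetical record identifiers, used only for illustration.
    records = ["r1", "r2", "r3", "r4"]
    pairs = candidate_pairs(records)
    print(len(pairs), pair_count(len(records)))

    # The pair count grows quadratically with the dataset size.
    for n in (1_000, 10_000, 100_000):
        print(n, pair_count(n))
```

Growing the dataset tenfold multiplies the number of candidate pairs by roughly a hundred, which is why labeling every pair by hand quickly becomes impractical.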

    Original language: English
    Title of host publication: Proceedings - 20th IEEE International Conference on Data Mining, ICDM 2020
    Editors: Claudia Plant, Haixun Wang, Alfredo Cuzzocrea, Carlo Zaniolo, Xindong Wu
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 1250-1255
    Number of pages: 6
    ISBN (Electronic): 9781728183169
    DOIs
    Publication status: Published - Nov 2020
    Event: 20th IEEE International Conference on Data Mining, ICDM 2020 - Virtual, Sorrento, Italy
    Duration: 17 Nov 2020 to 20 Nov 2020

    Publication series

    Name: Proceedings - IEEE International Conference on Data Mining, ICDM
    Volume: 2020-November
    ISSN (Print): 1550-4786

    Conference

    Conference: 20th IEEE International Conference on Data Mining, ICDM 2020
    Country/Territory: Italy
    City: Virtual, Sorrento
    Period: 17/11/20 to 20/11/20
