Distributed privacy-preserving record linkage using pivot-based filter techniques

Marcel Gladbach, Ziad Sehili, Thomas Kudraß, Peter Christen, Erhard Rahm

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    5 Citations (Scopus)

    Abstract

    Privacy-preserving record linkage (PPRL) aims at linking person-related records from different data sources while protecting privacy. It is applied in medical research to link health data without revealing sensitive person-related data. We propose and evaluate a new parallel PPRL approach based on Apache Flink that aims at high performance and scalability to large datasets. The approach supports a pivot-based filtering method for metric distance functions that saves many similarity computations. We describe our distributed approaches to determine pivots and pivot-based linkage. We also demonstrate the high efficiency of the approach for different datasets and configurations.
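
The pivot-based filtering mentioned in the abstract relies on the triangle inequality: for a metric d and a pivot p, d(s, t) >= |d(s, p) - d(t, p)|, so a pair can be discarded without computing d(s, t) whenever that lower bound already exceeds the match threshold. The following is a minimal single-machine sketch of this idea, not the authors' Flink implementation. It assumes records are encoded as bit vectors (stored here as Python ints) compared with the Hamming distance, and it uses a single fixed pivot; the function names, the pivot choice, and the threshold are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Hamming distance between two bit vectors stored as Python ints."""
    return bin(a ^ b).count("1")


def link_with_pivot(
    source: List[int], target: List[int], pivot: int, max_dist: int
) -> Tuple[List[Tuple[int, int]], int]:
    """Return matching pairs and the number of record-to-record distances computed.

    Assumption: one pivot and the Hamming distance; any metric would work.
    """
    # Distance of every target record to the pivot is computed once up front.
    target_to_pivot = [(t, hamming(t, pivot)) for t in target]
    matches: List[Tuple[int, int]] = []
    computed = 0
    for s in source:
        d_sp = hamming(s, pivot)
        for t, d_tp in target_to_pivot:
            # Triangle inequality gives a lower bound on d(s, t).
            if abs(d_sp - d_tp) > max_dist:
                continue  # pair cannot match, skip the full comparison
            computed += 1
            if hamming(s, t) <= max_dist:
                matches.append((s, t))
    return matches, computed


def link_brute_force(
    source: List[int], target: List[int], max_dist: int
) -> List[Tuple[int, int]]:
    """Reference result: compare every pair without any filtering."""
    return [(s, t) for s in source for t in target if hamming(s, t) <= max_dist]


# Hypothetical toy data: 8-bit encodings, all-zero pivot, threshold of 1 bit.
source = [0b00001111, 0b11110000]
target = [0b00001110, 0b11111111, 0b11110001]
matches, computed = link_with_pivot(source, target, pivot=0b00000000, max_dist=1)
```

On this toy input the filter yields the same matches as the brute-force comparison while computing 4 of the 6 possible pairwise distances. How much is saved in practice depends on the data, the threshold, and how the pivots are chosen.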

    Original language: English
    Title of host publication: Proceedings - IEEE 34th International Conference on Data Engineering Workshops, ICDEW 2018
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 33-38
    Number of pages: 6
    ISBN (Electronic): 9781538663066
    Publication status: Published - 2 Jul 2018
    Event: 34th IEEE International Conference on Data Engineering Workshops, ICDEW 2018 - Paris, France
    Duration: 16 Apr 2018 to 19 Apr 2018

    Publication series

    Name: Proceedings - IEEE 34th International Conference on Data Engineering Workshops, ICDEW 2018

    Conference

    Conference: 34th IEEE International Conference on Data Engineering Workshops, ICDEW 2018
    Country/Territory: France
    City: Paris
    Period: 16/04/18 to 19/04/18
