Hashing-based distributed multi-party blocking for privacy-preserving record linkage

Thilina Ranbaduge*, Dinusha Vatsalan, Peter Christen, Vassilios Verykios

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    12 Citations (Scopus)

    Abstract

    In many application domains, organizations require information from multiple sources to be integrated. Due to privacy and confidentiality concerns, these organizations are often not willing or allowed to reveal their sensitive and personal data to other database owners or to any external party. This has led to the emerging research discipline of privacy-preserving record linkage (PPRL). We propose a novel blocking approach for multi-party PPRL that efficiently and effectively prunes the record sets that are unlikely to match. Our approach allows each database owner to perform blocking independently, except for the initial agreement on parameter settings and a final central hashing-based clustering. We provide an analysis of our technique in terms of complexity, quality, and privacy, and conduct an empirical study with large datasets. The results show that our approach scales with both the size of the datasets and the number of parties, while providing better quality and privacy than previous multi-party private blocking approaches.

    Original language: English
    Title of host publication: Advances in Knowledge Discovery and Data Mining - 20th Pacific-Asia Conference, PAKDD 2016, Proceedings
    Editors: James Bailey, Latifur Khan, Takashi Washio, Gillian Dobbie, Joshua Zhexue Huang, Ruili Wang
    Publisher: Springer Verlag
    Pages: 415-427
    Number of pages: 13
    ISBN (Print): 9783319317496
    Publication status: Published - 2016
    Event: 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2016 - Auckland, New Zealand
    Duration: 19 Apr 2016 → 22 Apr 2016

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 9652 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 20th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2016
    Country/Territory: New Zealand
    City: Auckland
    Period: 19/04/16 → 22/04/16
