Scalable privacy-preserving record linkage for multiple databases

Dinusha Vatsalan, Peter Christen

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    40 Citations (Scopus)

    Abstract

    Privacy-preserving record linkage (PPRL) is the process of identifying records that correspond to the same real-world entities across several databases without revealing any sensitive information about these entities. Various techniques have been developed to tackle the problem of PPRL, but the majority of them only consider linking two databases. However, in many real-world applications, data from more than two sources need to be linked. In this paper, we consider the problem of linking data from three or more sources in an efficient and secure way. We propose a protocol that combines Bloom filters, secure summation, and Dice coefficient similarity calculation, with the aim of identifying all records held by the different data sources that have a similarity above a certain threshold. Our protocol is secure in that no party learns any sensitive information about the other parties' data, but all parties learn which of their records have a high similarity with records held by the other parties. We evaluate our protocol on a large dataset, showing its scalability, linkage quality, and privacy.
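
The three building blocks named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' protocol: the function names, the bigram tokenisation, the filter length, the hashing scheme, and the ring-style masking are all assumptions chosen for illustration. The multi-party Dice formula used here, P * c / (x_1 + ... + x_P) for P filters with c common 1-bits and x_i 1-bits in filter i, is a common generalisation of the two-set Dice coefficient and is likewise assumed rather than taken from the abstract. For brevity the common bits are intersected in the clear, which a real protocol would avoid.

```python
import hashlib
import secrets


def qgrams(value, q=2):
    """Split a string into its set of overlapping q-grams (bigrams by default)."""
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_filter(value, num_bits=64, num_hashes=2):
    """Encode a string as the set of Bloom filter bit positions set to 1.

    Hypothetical parameters: real deployments use longer filters and keyed hashes.
    """
    bits = set()
    for gram in qgrams(value):
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).hexdigest()
            bits.add(int(digest, 16) % num_bits)
    return bits


def secure_sum(private_values, modulus=2**32):
    """Simulate ring-based secure summation in a single process.

    The initiator starts the running total with a random mask, each party
    adds its private value, and the initiator removes the mask at the end,
    so intermediate totals do not expose individual values.
    """
    mask = secrets.randbelow(modulus)
    running = mask
    for value in private_values:
        running = (running + value) % modulus
    return (running - mask) % modulus


def dice_multi(filters):
    """Dice coefficient over P Bloom filters: P * c / sum of 1-bit counts."""
    common = len(set.intersection(*filters))  # in the clear, for illustration only
    total = secure_sum([len(f) for f in filters])
    return len(filters) * common / total if total else 0.0
```

A record set would then be classified as a match when `dice_multi` of the parties' filters reaches the chosen similarity threshold.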

    Original language: English
    Title of host publication: CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management
    Publisher: Association for Computing Machinery
    Pages: 1795-1798
    Number of pages: 4
    ISBN (Electronic): 9781450325981
    Publication status: Published - 3 Nov 2014
    Event: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014 - Shanghai, China
    Duration: 3 Nov 2014 - 7 Nov 2014

    Publication series

    Name: CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management

    Conference

    Conference: 23rd ACM International Conference on Information and Knowledge Management, CIKM 2014
    Country/Territory: China
    City: Shanghai
    Period: 3/11/14 - 7/11/14
