A scalable privacy-preserving framework for temporal record linkage

Thilina Ranbaduge*, Peter Christen

*Corresponding author for this work

    Research output: Contribution to journalArticlepeer-review

    2 Citations (Scopus)

    Abstract

    Record linkage (RL) is the process of identifying matching records from different databases that refer to the same entity. In many applications, it is common that the attribute values of records that belong to the same entity evolve over time, for example people can change their surname or address. Therefore, to identify the records that refer to the same entity over time, RL should make use of temporal information such as the time-stamp of when a record was created and/or update last. However, if RL needs to be conducted on information about people, due to privacy and confidentiality concerns organisations are often not willing or allowed to share sensitive data in their databases, such as personal medical records or location and financial details, with other organisations. This paper proposes a scalable framework for privacy-preserving temporal record linkage that can link different databases while ensuring the privacy of sensitive data in these databases. We propose two protocols that can be used in different linkage scenarios with and without a third party. Our protocols use Bloom filter encoding which incorporates the temporal information available in records during the linkage process. Our approaches first securely calculate the probabilities of entities changing attribute values in their records over a period of time. Based on these probabilities, we then generate a set of masking Bloom filters to adjust the similarities between record pairs. We provide a theoretical analysis of the complexity and privacy of our techniques and conduct an empirical study on large real databases containing several millions of records. The experimental results show that our approaches can achieve better linkage quality compared to non-temporal PPRL while providing privacy to individuals in the databases that are being linked.

    Original languageEnglish
    Pages (from-to)45-78
    Number of pages34
    JournalKnowledge and Information Systems
    Volume62
    Issue number1
    DOIs
    Publication statusPublished - 1 Jan 2020

    Fingerprint

    Dive into the research topics of 'A scalable privacy-preserving framework for temporal record linkage'. Together they form a unique fingerprint.

    Cite this