A Graph Matching Attack on Privacy-Preserving Record Linkage

Anushka Vidanage, Peter Christen, Thilina Ranbaduge, Rainer Schnell

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    18 Citations (Scopus)

    Abstract

    To facilitate advanced analytics, data science projects increasingly require records about individuals to be linked across databases. Generally, no unique entity identifiers are available in the databases to be linked, and therefore quasi-identifiers such as names, addresses, and dates of birth are used to link records. The process of linking records without revealing any sensitive or confidential information about the entities represented by these records is known as privacy-preserving record linkage (PPRL). Various encoding- and encryption-based PPRL methods have been developed in the past two decades. Most existing PPRL methods calculate approximate similarities between records because errors and variations can occur in quasi-identifying attribute values. Even though they are used in real-world linkage applications, certain PPRL methods, such as the popular Bloom filter encoding, have been shown to be vulnerable to cryptanalysis attacks. In this paper, we present a novel attack on PPRL methods that exploits the approximate similarities calculated between encoded records. Our attack matches nodes in a similarity graph generated from an encoded database with nodes in a corresponding similarity graph generated from a plain-text database to re-identify sensitive values. Our attack is not limited to any specific PPRL method, and in an experimental evaluation we apply it to three PPRL encoding methods using three different databases. This evaluation shows that our attack can successfully re-identify sensitive values from these encodings with high accuracy where no previous attack on PPRL would have been successful.
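
The core idea in the abstract, matching nodes of an encoded similarity graph against nodes of a plain-text similarity graph, can be illustrated with a toy sketch. This is not the authors' algorithm. It assumes a simplified Bloom filter encoding of character bigrams, Dice similarity on both sides, and a greedy match on sorted similarity profiles. All function names, the `secret` key, the hashing scheme, and the example names are hypothetical and chosen only for illustration.

```python
import hashlib
from itertools import combinations


def qgrams(value, q=2):
    """Return the set of character q-grams of a string (assumed tokenisation)."""
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def dice(a, b):
    """Dice coefficient between two sets; 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def bloom_encode(value, num_bits=1000, num_hashes=2, secret="hypothetical-key"):
    """Toy Bloom filter: the set of bit positions set to 1 for a value."""
    bits = set()
    for gram in qgrams(value):
        for k in range(num_hashes):
            digest = hashlib.sha256(f"{secret}|{k}|{gram}".encode()).hexdigest()
            bits.add(int(digest, 16) % num_bits)
    return bits


def similarity_features(items):
    """Build a complete similarity graph over `items` (node -> set) and
    return, per node, its edge similarities sorted in descending order."""
    feats = {node: [] for node in items}
    for (u, su), (v, sv) in combinations(items.items(), 2):
        s = dice(su, sv)
        feats[u].append(s)
        feats[v].append(s)
    return {node: sorted(f, reverse=True) for node, f in feats.items()}


def match_nodes(enc_feats, plain_feats):
    """Greedy one-to-one matching of encoded nodes to plain-text nodes,
    pairing the nodes whose similarity profiles differ the least."""
    candidates = sorted(
        (sum(abs(x - y) for x, y in zip(ef, pf)), e, p)
        for e, ef in enc_feats.items()
        for p, pf in plain_feats.items()
    )
    mapping, used = {}, set()
    for _, e, p in candidates:
        if e not in mapping and p not in used:
            mapping[e] = p
            used.add(p)
    return mapping


# Hypothetical data: the attacker sees only the encodings and holds a
# plain-text list assumed to contain the same values.
names = ["smith", "smyth", "johnson", "jonson", "lee"]
encoded = {i: bloom_encode(n) for i, n in enumerate(names)}
plain = {n: qgrams(n) for n in names}

mapping = match_nodes(similarity_features(encoded), similarity_features(plain))
print(mapping)
```

Here the encoded identifiers follow the order of `names` only so that the guessed mapping can be checked by eye; a real encoded database would carry no such ordering. The sketch never compares an encoding with a plain-text value directly. It relies only on similarities computed within each database, which is why an approach of this kind does not depend on one particular encoding method.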

    Original language: English
    Title of host publication: CIKM 2020 - Proceedings of the 29th ACM International Conference on Information and Knowledge Management
    Publisher: Association for Computing Machinery
    Pages: 1485-1494
    Number of pages: 10
    ISBN (Electronic): 9781450368599
    Publication status: Published - 19 Oct 2020
    Event: 29th ACM International Conference on Information and Knowledge Management, CIKM 2020 - Virtual, Online, Ireland
    Duration: 19 Oct 2020 to 23 Oct 2020

    Publication series

    Name: International Conference on Information and Knowledge Management, Proceedings

    Conference

    Conference: 29th ACM International Conference on Information and Knowledge Management, CIKM 2020
    Country/Territory: Ireland
    City: Virtual, Online
    Period: 19/10/20 to 23/10/20
