Pattern-mining based cryptanalysis of Bloom filters for privacy-preserving record linkage

Peter Christen*, Anushka Vidanage, Thilina Ranbaduge, Rainer Schnell

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    19 Citations (Scopus)

    Abstract

    Data mining projects increasingly require records about individuals to be linked across databases to facilitate advanced analytics. The process of linking records without revealing any sensitive or confidential information about the entities represented by these records is known as privacy-preserving record linkage (PPRL). Bloom filters are a popular PPRL technique to encode sensitive information while still enabling approximate linking of records. However, Bloom filter encoding can be vulnerable to attacks that re-identify some encoded values from sets of Bloom filters. Existing attacks exploit the fact that certain Bloom filters occur frequently in an encoded database and thus likely correspond to frequent plain-text values such as common names. We present a novel attack method based on a maximal frequent itemset mining technique which identifies frequently co-occurring bit positions in a set of Bloom filters. Our attack can re-identify encoded sensitive values even when all Bloom filters in an encoded database are unique. As our experiments on a real-world data set show, our attack can successfully re-identify values from encoded Bloom filters even in scenarios where previous attacks fail.
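The two ideas the abstract relies on, Bloom filter encoding for approximate matching and bit positions that co-occur across filters, can be illustrated with a minimal sketch. It is not the authors' encoding or attack: the bigram tokenisation, the double-hashing scheme built from SHA-1 and SHA-256, the filter length of 64 bits, the two hash functions per q-gram, and all function names are assumptions chosen for illustration, and no frequent itemset mining is performed.

```python
import hashlib


def qgrams(value, q=2):
    """Split a string into its set of overlapping q-grams (bigrams by default)."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bit_positions(gram, num_bits=64, num_hashes=2):
    """Bit positions a single q-gram sets (assumed double-hashing scheme)."""
    data = gram.encode("utf-8")
    h1 = int(hashlib.sha1(data).hexdigest(), 16)
    h2 = int(hashlib.sha256(data).hexdigest(), 16)
    return {(h1 + i * h2) % num_bits for i in range(num_hashes)}


def encode(value, num_bits=64, num_hashes=2):
    """Encode a string as a Bloom filter, represented as the set of 1-bit positions."""
    positions = set()
    for gram in qgrams(value):
        positions |= bit_positions(gram, num_bits, num_hashes)
    return positions


def dice(a, b):
    """Dice coefficient of two Bloom filters, used for approximate matching."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical example values that all contain the bigram "sm".
names = ["smith", "smyth", "smithson"]
filters = [encode(name) for name in names]

# Similar strings share q-grams, so their filters share 1-bits.
similarity = dice(filters[0], filters[1])

# Bit positions set in every filter: the positions of the shared
# bigram "sm" are guaranteed to be among them.
common = set.intersection(*filters)
shared_bigram_bits = bit_positions("sm")
```

In this toy setting every filter that encodes a given q-gram has that q-gram's bit positions set, which is why groups of bit positions can co-occur frequently even when no two complete Bloom filters are identical.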

    Original language: English
    Title of host publication: Advances in Knowledge Discovery and Data Mining - 22nd Pacific-Asia Conference, PAKDD 2018, Proceedings
    Editors: Geoffrey I. Webb, Dinh Phung, Mohadeseh Ganji, Lida Rashidi, Vincent S. Tseng, Bao Ho
    Publisher: Springer Verlag
    Pages: 530-542
    Number of pages: 13
    ISBN (Print): 9783319930398
    Publication status: Published - 2018
    Event: 22nd Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2018 - Melbourne, Australia
    Duration: 3 Jun 2018 to 6 Jun 2018

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 10939 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 22nd Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2018
    Country/Territory: Australia
    City: Melbourne
    Period: 3/06/18 to 6/06/18
