Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation

Sarah Tan, Rich Caruana, Giles Hooker, Yin Lou

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    Black-box risk scoring models permeate our lives, yet are typically proprietary or opaque. We propose Distill-and-Compare, an approach to audit such models without probing the black-box model API or pre-defining features to audit. To gain insight into black-box models, we treat them as teachers, training transparent student models to mimic the risk scores assigned by the black-box models. We compare the mimic model trained with distillation to a second, un-distilled transparent model trained on ground truth outcomes, and use differences between the two models to gain insight into the black-box model. We demonstrate the approach on four data sets: COMPAS, Stop-and-Frisk, Chicago Police, and Lending Club. We also propose a statistical test to determine if a data set is missing key features used to train the black-box model. Our test finds that the ProPublica data is likely missing key feature(s) used in COMPAS. © 2018 ACM.
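The distill-then-compare idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "transparent model" is simply the mean target per level of a single categorical feature (a stand-in, not the model class used in the paper), the function names `fit_level_means` and `distill_and_compare` are hypothetical, the toy rows are invented, and the black-box scores are assumed to be already on the same 0-1 scale as the outcomes.

```python
from collections import defaultdict


def fit_level_means(rows, targets, feature):
    """Stand-in transparent model: mean target for each level of one feature."""
    sums, counts = defaultdict(float), defaultdict(int)
    for row, target in zip(rows, targets):
        sums[row[feature]] += target
        counts[row[feature]] += 1
    return {level: sums[level] / counts[level] for level in sums}


def distill_and_compare(rows, blackbox_scores, outcomes, features):
    """For each feature level, return (mimic model) minus (outcome model)."""
    diffs = {}
    for feature in features:
        # Student model distilled from the black-box risk scores.
        mimic = fit_level_means(rows, blackbox_scores, feature)
        # Second transparent model trained on ground-truth outcomes.
        truth = fit_level_means(rows, outcomes, feature)
        diffs[feature] = {level: mimic[level] - truth[level] for level in mimic}
    return diffs


# Invented toy data: two categorical features, black-box scores, true outcomes.
rows = [
    {"group": "a", "priors": "low"},
    {"group": "a", "priors": "high"},
    {"group": "b", "priors": "low"},
    {"group": "b", "priors": "high"},
]
blackbox_scores = [0.2, 0.8, 0.4, 1.0]
outcomes = [0, 1, 0, 1]

diffs = distill_and_compare(rows, blackbox_scores, outcomes, ["group", "priors"])
for feature, by_level in diffs.items():
    for level, gap in by_level.items():
        print(f"{feature}={level}: mimic - outcome = {gap:+.2f}")
```

In this toy data the two groups have identical outcome rates, yet the mimic model assigns group `b` a higher average score than group `a`, so the gap for `group=b` is positive while the gap for `group=a` is zero. A difference of that kind is what the comparison step is meant to surface; a real audit would also need calibrated scores and uncertainty estimates, which this sketch omits.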
    Original language: English
    Title of host publication: Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation
    Place of Publication: New Orleans
    Publisher: Association for Computing Machinery, Inc
    Pages: 303-310
    ISBN (Print): 978-145036012-8
    DOI: https://doi.org/10.1145/3278721.3278725
    Publication status: Published - 2018
    Event: 1st AAAI/ACM Conference on AI, Ethics, and Society, AIES 2018 - New Orleans, United States
    Duration: 1 Jan 2018 → …

    Conference

    Conference: 1st AAAI/ACM Conference on AI, Ethics, and Society, AIES 2018
    Country/Territory: United States
    Period: 1/01/18 → …
    Other: 2 February 2018 through 3 February 2018