Abstract
Black-box risk scoring models permeate our lives, yet are typically proprietary or opaque. We propose Distill-and-Compare, an approach to audit such models without probing the black-box model API or pre-defining features to audit. To gain insight into black-box models, we treat them as teachers, training transparent student models to mimic the risk scores assigned by the black-box models. We compare the mimic model trained with distillation to a second, un-distilled transparent model trained on ground truth outcomes, and use differences between the two models to gain insight into the black-box model. We demonstrate the approach on four data sets: COMPAS, Stop-and-Frisk, Chicago Police, and Lending Club. We also propose a statistical test to determine if a data set is missing key features used to train the black-box model. Our test finds that the ProPublica data is likely missing key feature(s) used in COMPAS. © 2018 ACM.
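The distill-and-compare idea in the abstract can be illustrated with a toy sketch. It assumes the audit data records both the black-box risk score and the ground-truth outcome for each individual, and that both lie on a comparable 0 to 1 scale. The transparent "model" here is a simple stand-in (the mean target value per level of a single feature), not the model class used in the paper. The names `fit_level_means`, `distill_and_compare`, `risk_score`, `outcome`, and `group` are hypothetical.

```python
from collections import defaultdict


def fit_level_means(rows, feature, target):
    """Toy transparent model: mean of `target` for each level of `feature`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[feature]] += row[target]
        counts[row[feature]] += 1
    return {level: sums[level] / counts[level] for level in sums}


def distill_and_compare(rows, feature, score_key="risk_score", outcome_key="outcome"):
    """Return, per feature level, mimic-model value minus outcome-model value."""
    # Student (mimic) model: trained on the scores assigned by the black box.
    mimic = fit_level_means(rows, feature, score_key)
    # Second transparent model: trained on ground-truth outcomes instead.
    outcome = fit_level_means(rows, feature, outcome_key)
    # Differences between the two models hint at how the black box treats
    # each level relative to what the observed outcomes alone would suggest.
    return {level: mimic[level] - outcome[level] for level in mimic}


# Made-up audit records, for illustration only (not data from the paper).
rows = [
    {"group": "a", "risk_score": 0.8, "outcome": 1},
    {"group": "a", "risk_score": 0.6, "outcome": 0},
    {"group": "b", "risk_score": 0.2, "outcome": 0},
    {"group": "b", "risk_score": 0.4, "outcome": 1},
]

gaps = distill_and_compare(rows, "group")
for level, gap in sorted(gaps.items()):
    print(f"{level}: mimic minus outcome = {gap:+.2f}")
```

In this toy data both groups have the same observed outcome rate, so a positive gap for one group and a negative gap for the other would be the kind of difference an auditor might examine further. A real audit would also need to account for estimation uncertainty before reading anything into such gaps.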
Original language | English |
---|---|
Title of host publication | Distill-and-Compare: Auditing Black-Box Models Using Transparent Model Distillation |
Place of Publication | New Orleans |
Publisher | Association for Computing Machinery, Inc |
Pages | 303-310 |
ISBN (Print) | 978-1-4503-6012-8 |
DOIs | https://doi.org/10.1145/3278721.3278725 |
Publication status | Published - 2018 |
Event | 1st AAAI/ACM Conference on AI, Ethics, and Society, AIES 2018 - New Orleans, United States. Duration: 1 Jan 2018 → … |
Conference
Conference | 1st AAAI/ACM Conference on AI, Ethics, and Society, AIES 2018 |
---|---|
Country/Territory | United States |
Period | 1/01/18 → … |
Other | 2 February 2018 through 3 February 2018 |