Learning from corrupted binary labels via class-probability estimation

Aditya Krishna Menon, Brendan Van Rooyen, Cheng Soon Ong, Robert C. Williamson

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    165 Citations (Scopus)

    Abstract

    Many supervised learning problems involve learning from samples whose labels are corrupted in some way. For example, each label may be flipped with some constant probability (learning with label noise), or one may have a pool of unlabelled samples in lieu of negative samples (learning from positive and unlabelled data). This paper uses class-probability estimation to study these and other corruption processes belonging to the mutually contaminated distributions framework (Scott et al., 2013), with three conclusions. First, one can optimise balanced error and AUC without knowledge of the corruption parameters. Second, given estimates of the corruption parameters, one can minimise a range of classification risks. Third, one can estimate corruption parameters via a class-probability estimator (e.g. kernel logistic regression) trained solely on corrupted data. Experiments on label noise tasks corroborate our analysis.
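
    The abstract's third conclusion suggests a simple recipe: fit a class-probability estimator on the corrupted data alone and read the corruption parameters off the range of its predictions. The sketch below illustrates that idea under the additional assumptions of class-conditional label noise (flip rates rho_plus and rho_minus) and a weakly separable clean distribution; it is not the paper's code. scikit-learn's LogisticRegression stands in for the kernel logistic regression example, and the names estimate_noise_rates, X_corrupted, y_corrupted are illustrative.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def estimate_noise_rates(X_corrupted, y_corrupted):
        """Estimate flip rates (rho_plus, rho_minus) from corrupted data only.

        Under class-conditional noise with rho_plus + rho_minus < 1, the
        corrupted class-probability is
            eta_corr(x) = (1 - rho_plus) * eta(x) + rho_minus * (1 - eta(x)),
        so if the clean eta(x) attains both 0 and 1 somewhere, then
            max_x eta_corr(x) = 1 - rho_plus,   min_x eta_corr(x) = rho_minus.
        Labels are assumed to be encoded as {0, 1}.
        """
        # Class-probability estimator trained solely on corrupted samples.
        cpe = LogisticRegression(max_iter=1000).fit(X_corrupted, y_corrupted)
        eta_corr = cpe.predict_proba(X_corrupted)[:, 1]

        # Sample max/min of the predicted probabilities serve as plug-in
        # estimates of the extremes of eta_corr.
        rho_plus = 1.0 - eta_corr.max()   # P(label flipped | truly positive)
        rho_minus = eta_corr.min()        # P(label flipped | truly negative)
        return rho_plus, rho_minus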

    Original language: English
    Title of host publication: 32nd International Conference on Machine Learning, ICML 2015
    Editors: Francis Bach, David Blei
    Publisher: International Machine Learning Society (IMLS)
    Pages: 125-134
    Number of pages: 10
    ISBN (Electronic): 9781510810587
    Publication status: Published - 2015
    Event: 32nd International Conference on Machine Learning, ICML 2015 - Lille, France
    Duration: 6 Jul 2015 – 11 Jul 2015

    Publication series

    Name: 32nd International Conference on Machine Learning, ICML 2015
    Volume: 1

    Conference

    Conference: 32nd International Conference on Machine Learning, ICML 2015
    Country/Territory: France
    City: Lille
    Period: 6/07/15 – 11/07/15
