A theory of learning with corrupted labels

Brendan van Rooyen, Robert C. Williamson

    Research output: Contribution to journal › Article › peer-review

    29 Citations (Scopus)

    Abstract

    It is usual in machine learning theory to assume that the training and testing sets comprise draws from the same distribution. This is rarely, if ever, true and one must admit the presence of corruption. There are many different types of corruption that can arise, and as yet there is no general means to compare the relative ease of learning in these settings. Such results are necessary if we are to make informed economic decisions regarding the acquisition of data. Here we begin to develop an abstract framework for tackling these problems. We present a generic method for learning from a fixed, known, reconstructible corruption, along with an analysis of its statistical properties. We demonstrate the utility of our framework via concrete novel results in solving supervised learning problems wherein the labels are corrupted, such as learning with noisy labels, semi-supervised learning and learning with partial labels.
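To make the idea of learning from a fixed, known, invertible label corruption concrete, here is a minimal sketch. It assumes binary labels in {-1, +1} and known class-conditional flip rates, and it uses the standard unbiased-estimator loss correction for that setting. This is an illustration only, not necessarily the construction used in the paper; the names `logistic_loss`, `corrected_loss`, `expected_corrected_loss`, `rho_pos` and `rho_neg` are hypothetical.

```python
import math


def logistic_loss(score, label):
    """Logistic loss for a real-valued score and a label in {-1, +1}."""
    return math.log1p(math.exp(-label * score))


def corrected_loss(loss, score, noisy_label, rho_pos, rho_neg):
    """Loss evaluated on a noisy label, corrected for known flip rates.

    Assumed noise model (not taken from the abstract):
      rho_pos = P(observed -1 | true +1)
      rho_neg = P(observed +1 | true -1)
    Averaged over the label flips, the result equals the clean loss.
    """
    denom = 1.0 - rho_pos - rho_neg
    if denom <= 0.0:
        raise ValueError("this sketch assumes rho_pos + rho_neg < 1")
    # Flip rate of the observed class and of the opposite class.
    rho_same = rho_pos if noisy_label == 1 else rho_neg
    rho_other = rho_neg if noisy_label == 1 else rho_pos
    return (
        (1.0 - rho_other) * loss(score, noisy_label)
        - rho_same * loss(score, -noisy_label)
    ) / denom


def expected_corrected_loss(loss, score, true_label, rho_pos, rho_neg):
    """Average the corrected loss over the two possible observed labels."""
    flip = rho_pos if true_label == 1 else rho_neg
    kept = corrected_loss(loss, score, true_label, rho_pos, rho_neg)
    flipped = corrected_loss(loss, score, -true_label, rho_pos, rho_neg)
    return (1.0 - flip) * kept + flip * flipped
```

Under these assumptions, minimising the average corrected loss on noisy data targets the same risk as minimising the original loss on clean data, at the price of a variance that grows as `rho_pos + rho_neg` approaches 1.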

    Original language: English
    Pages (from-to): 1-50
    Number of pages: 50
    Journal: Journal of Machine Learning Research
    Volume: 18
    Publication status: Published - 1 Jul 2018
