Correcting sample selection bias by unlabeled data

Jiayuan Huang*, Alexander J. Smola, Arthur Gretton, Karsten M. Borgwardt, Bernhard Schölkopf

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    1042 Citations (Scopus)

    Abstract

    We consider the scenario where training and test data are drawn from different distributions, commonly referred to as sample selection bias. Most algorithms for this setting try to first recover sampling distributions and then make appropriate corrections based on the distribution estimate. We present a nonparametric method which directly produces resampling weights without distribution estimation. Our method works by matching distributions between training and testing sets in feature space. Experimental results demonstrate that our method works well in practice.
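
The abstract's idea of producing resampling weights by matching the training and test distributions in feature space can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from this record: the Gaussian kernel, the function names `rbf_kernel` and `mean_matching_weights`, the parameters `sigma`, `upper`, and `steps`, and the projected-gradient solver. The published method may use a different solver and additional constraints on the weights.

```python
import numpy as np


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def mean_matching_weights(x_train, x_test, sigma=1.0, upper=10.0, steps=2000):
    """Illustrative weights beta for the training points (hypothetical helper).

    Minimizes 0.5 * beta^T K beta - kappa^T beta subject to 0 <= beta <= upper.
    Up to a constant and a positive factor, this is the squared feature-space
    distance between the weighted training mean and the test mean.
    """
    n, m = len(x_train), len(x_test)
    K = rbf_kernel(x_train, x_train, sigma)
    kappa = (n / m) * rbf_kernel(x_train, x_test, sigma).sum(axis=1)

    # Step size 1/L, where L is the largest eigenvalue of K.
    lr = 1.0 / np.linalg.eigvalsh(K)[-1]
    beta = np.ones(n)  # start from uniform weights
    for _ in range(steps):
        grad = K @ beta - kappa
        # Gradient step followed by projection onto the box [0, upper].
        beta = np.clip(beta - lr * grad, 0.0, upper)
    return beta


# Synthetic example: test inputs are shifted relative to training inputs.
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=(100, 1))
x_test = rng.normal(1.5, 0.5, size=(80, 1))
beta = mean_matching_weights(x_train, x_test)
```

The resulting `beta` could be passed as per-sample weights to any learner that accepts them. No density is estimated at any point; only kernel evaluations between training and test inputs are needed, and no test labels are used.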

    Original language: English
    Title of host publication: Advances in Neural Information Processing Systems 19 - Proceedings of the 2006 Conference
    Pages: 601-608
    Number of pages: 8
    Publication status: Published - 2007
    Event: 20th Annual Conference on Neural Information Processing Systems, NIPS 2006 - Vancouver, BC, Canada
    Duration: 4 Dec 2006 to 7 Dec 2006

    Publication series

    Name: Advances in Neural Information Processing Systems
    ISSN (Print): 1049-5258

    Conference

    Conference: 20th Annual Conference on Neural Information Processing Systems, NIPS 2006
    Country/Territory: Canada
    City: Vancouver, BC
    Period: 4/12/06 to 7/12/06
