A kernel method for the two-sample-problem

Arthur Gretton*, Karsten M. Borgwardt, Malte Rasch, Bernhard Schölkopf, Alexander J. Smola

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    1143 Citations (Scopus)

    Abstract

    We propose two statistical tests to determine if two samples are from different distributions. Our test statistic is in both cases the distance between the means of the two samples mapped into a reproducing kernel Hilbert space (RKHS). The first test is based on a large deviation bound for the test statistic, while the second is based on the asymptotic distribution of this statistic. The test statistic can be computed in O(m^2) time. We apply our approach to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where our test performs strongly. We also demonstrate excellent performance when comparing distributions over graphs, for which no alternative tests currently exist.
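
    The statistic described in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian kernel and a simple biased estimate of the squared RKHS distance between the two empirical mean embeddings; the function names, the kernel choice, and the bandwidth `sigma` are illustrative assumptions, not taken from the paper, whose exact estimator and thresholds may differ. The nested pairwise sums show where the quadratic cost in the sample size comes from.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats.

    The kernel and the bandwidth sigma are assumptions made for illustration.
    """
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs: len(xs) * len(ys) evaluations."""
    total = sum(kernel(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2_biased(xs, ys, kernel=gaussian_kernel):
    """Biased estimate of the squared RKHS distance between sample means.

    Expanding ||mean(phi(x)) - mean(phi(y))||^2 with the kernel trick gives
    three pairwise averages, so no explicit feature map is needed.
    """
    return (mean_kernel(xs, xs, kernel)
            + mean_kernel(ys, ys, kernel)
            - 2.0 * mean_kernel(xs, ys, kernel))


if __name__ == "__main__":
    # Two small one-dimensional samples that are far apart (made-up data).
    sample_a = [(0.0,), (1.0,), (2.0,)]
    sample_b = [(10.0,), (11.0,), (12.0,)]
    print("same sample:     ", mmd2_biased(sample_a, sample_a))
    print("different sample:", mmd2_biased(sample_a, sample_b))
```

    A value near zero is what one would expect when both samples come from the same distribution. Turning the value into an accept/reject decision requires a threshold, such as the large deviation bound or the asymptotic distribution mentioned in the abstract, which this sketch does not implement.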

    Original language: English
    Title of host publication: Advances in Neural Information Processing Systems 19 - Proceedings of the 2006 Conference
    Pages: 513-520
    Number of pages: 8
    Publication status: Published - 2007
    Event: 20th Annual Conference on Neural Information Processing Systems, NIPS 2006 - Vancouver, BC, Canada
    Duration: 4 Dec 2006 to 7 Dec 2006

    Publication series

    Name: Advances in Neural Information Processing Systems
    ISSN (Print): 1049-5258

    Conference

    Conference: 20th Annual Conference on Neural Information Processing Systems, NIPS 2006
    Country/Territory: Canada
    City: Vancouver, BC
    Period: 4/12/06 to 7/12/06
