TY - JOUR

T1 - Permutation tests for equality of distributions in high-dimensional settings

AU - Hall, Peter

AU - Tajvidi, Nader

PY - 2002

Y1 - 2002

N2 - Motivated by applications in high-dimensional settings, we suggest a test of the hypothesis H0 that two sampled distributions are identical. It is assumed that two independent datasets are drawn from the respective populations, which may be very general. In particular, the distributions may be multivariate or infinite-dimensional, in the latter case representing, for example, the distributions of random functions from one Euclidean space to another. Our test uses a measure of distance between data. This measure should be symmetric but need not satisfy the triangle inequality, so it is not essential that it be a metric. The test is based on ranking the pooled dataset, with respect to the distance and relative to any fixed data value, and repeating this operation for each fixed datum. A permutation argument enables a critical point to be chosen such that the test has concisely known significance level, conditional on the set of all pairwise distances.

KW - Bootstrap

KW - Functional data analysis

KW - Hypergeometric distribution

KW - Hypothesis test

KW - Local alternative

KW - Multivariate analysis

KW - Rank test

KW - Resampling

UR - http://www.scopus.com/inward/record.url?scp=22944460361&partnerID=8YFLogxK

U2 - 10.1093/biomet/89.2.359

DO - 10.1093/biomet/89.2.359

M3 - Article

SN - 0006-3444

VL - 89

SP - 359

EP - 374

JO - Biometrika

JF - Biometrika

IS - 2

ER -