Abstract
We consider the classical statistical learning/regression problem, in which the value of a real random variable Y is to be predicted based on the observation of another random variable X. Given a class of functions F and a sample of independent copies of (X, Y), one needs to choose a function f from F such that f(X) approximates Y as well as possible in the mean-squared sense. We introduce a new procedure, the so-called median-of-means tournament, that achieves the optimal tradeoff between accuracy and confidence under minimal assumptions and, in particular, outperforms classical methods based on empirical risk minimization.
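To make the idea of comparing candidate functions by median-of-means concrete, the following is a minimal sketch, not the procedure defined in the paper. It assumes a simplified "match" between two candidates f and g: the sample is split into blocks, the difference of squared losses is averaged within each block, and f is declared the winner if the median of those block averages is negative. The names `median_of_means` and `mom_match` are hypothetical, and the full tournament involves further steps that this sketch omits.

```python
import numpy as np


def median_of_means(values, n_blocks):
    """Median of block averages: split `values` into `n_blocks` equal-size
    blocks (dropping any remainder), average each block, take the median."""
    values = np.asarray(values, dtype=float)
    block_size = len(values) // n_blocks
    if block_size == 0:
        raise ValueError("need at least one sample per block")
    trimmed = values[: block_size * n_blocks]
    block_means = trimmed.reshape(n_blocks, block_size).mean(axis=1)
    return float(np.median(block_means))


def mom_match(f, g, x, y, n_blocks):
    """Simplified, hypothetical 'match' between candidates f and g.

    Returns True if the median-of-means of the squared-loss difference
    (f(x) - y)^2 - (g(x) - y)^2 is negative, i.e. f looks better than g
    on the median block. This is an illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    loss_diff = (f(x) - y) ** 2 - (g(x) - y) ** 2
    return bool(median_of_means(loss_diff, n_blocks) < 0)


# A single extreme value barely moves the median-of-means estimate.
print(median_of_means([1, 2, 3, 4, 5, 1000], n_blocks=3))  # → 3.5
```

In the example call, the plain average of the six values is about 169.2, while the median of the three block averages (1.5, 3.5, 502.5) is 3.5. This robustness to a few extreme observations is the property the sketch is meant to illustrate.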
Original language | English |
---|---|
Pages (from-to) | 925-965 |
Number of pages | 41 |
Journal | Journal of the European Mathematical Society |
Volume | 22 |
Issue number | 3 |
Publication status | Published - 2020 |