Abstract
Consider the following problem: given sets of unlabeled observations, each set with known label proportions, predict the labels of another set of observations, possibly with known label propor-tions. This problem occurs in areas like e-commerce, politics, spam filtering and improper content detection. We present consistent estimators which can reconstruct the correct labels with high prob-ability in a uniform convergence sense. Experiments show that our method works well in practice.
Original language | English |
---|---|
Pages (from-to) | 2349-2374 |
Number of pages | 26 |
Journal | Journal of Machine Learning Research |
Volume | 10 |
Publication status | Published - 2009 |
Externally published | Yes |