Abstract
Csiszár's f-divergence is a way to measure the similarity of two probability distributions. We study the extension of f-divergence to more than two distributions in order to measure their joint similarity. By exploiting classical results from the comparison of experiments literature, we prove that the resulting divergence satisfies all the same properties as the traditional binary one. Considering the multidistribution case actually makes the proofs simpler. The key to these results is a formal bridge between these multidistribution f-divergences and Bayes risks for multiclass classification problems.
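For orientation, the sketch below records a standard form of Csiszár's binary f-divergence and one natural multidistribution generalization. The exact definition adopted in the paper is not reproduced in this record, so this is an illustrative form only, assuming densities p_1, …, p_n with respect to a common dominating measure μ and a convex function f.

```latex
% Binary f-divergence (Csiszar): f convex with f(1) = 0,
% p and q densities of P and Q w.r.t. a dominating measure mu.
I_f(P, Q) = \int q(x)\, f\!\left(\frac{p(x)}{q(x)}\right) \mathrm{d}\mu(x)

% One natural multidistribution form for P_1, ..., P_n:
% f convex on R_{+}^{n-1}, ratios taken with respect to P_n.
I_f(P_1, \ldots, P_n)
  = \int p_n(x)\, f\!\left(\frac{p_1(x)}{p_n(x)}, \ldots,
      \frac{p_{n-1}(x)}{p_n(x)}\right) \mathrm{d}\mu(x)
```

The "bridge" referred to in the abstract relates such divergences to the reduction in Bayes risk of a multiclass classification problem whose class-conditional distributions are P_1, …, P_n; see the paper for the precise statement.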
Original language | English
---|---
Pages (from-to) | 28.1-28.20
Journal | Journal of Machine Learning Research
Volume | 23
Publication status | Published - 2012
Event | 25th Annual Conference on Learning Theory, COLT 2012 - Edinburgh, United Kingdom. Duration: 25 Jun 2012 → 27 Jun 2012