A scaled Bregman theorem with applications

Richard Nock, Aditya Krishna Menon, Cheng Soon Ong

    Research output: Contribution to journal › Conference article › peer-review

    10 Citations (Scopus)

    Abstract

    Bregman divergences play a central role in the design and analysis of a range of machine learning algorithms through a handful of popular theorems. We present a new theorem which shows that "Bregman distortions" (employing a potentially non-convex generator) may be exactly re-written as a scaled Bregman divergence computed over transformed data. This property can be viewed from the standpoints of geometry (a scaled isometry with adaptive metrics) or convex optimization (relating generalized perspective transforms). Admissible distortions include geodesic distances on curved manifolds and projections or gauge normalisations. Our theorem allows one to leverage the wealth and convenience of Bregman divergences when analysing algorithms that rely on the aforementioned Bregman distortions. We illustrate this with three novel applications of our theorem: a reduction from multi-class density-ratio estimation to class-probability estimation, a new adaptive, projection-free yet norm-enforcing dual-norm mirror descent algorithm, and a reduction from clustering on flat manifolds to clustering on curved manifolds. Experiments on each of these domains validate the analyses and suggest that the scaled Bregman theorem might be a worthy addition to the popular handful of Bregman divergence properties that have been pervasive in machine learning.
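
    The following is a minimal numerical sketch (not the authors' code) of the kind of identity the abstract describes: a "Bregman distortion" built from a perspective-transformed generator agreeing with a scaled Bregman divergence over transformed data. The concrete choices here — the generator phi(z) = ||z||^2/2 and a positive affine scaling g — are illustrative assumptions, covering only one simple instance of the theorem.

```python
# Sketch: check one instance of the scaled Bregman identity
#   D_{phi_g}(x, y) = g(x) * D_phi(x / g(x), y / g(y)),
# where phi is convex, g is a positive affine scaling, and
# phi_g(z) = g(z) * phi(z / g(z)) is the (possibly non-convex)
# generalized-perspective generator of the "Bregman distortion".
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <x - y, grad phi(y)>."""
    return phi(x) - phi(y) - np.dot(x - y, grad_phi(y))

# Convex generator: half squared Euclidean norm (illustrative choice).
phi = lambda z: 0.5 * np.dot(z, z)
grad_phi = lambda z: z

# Positive affine scaling g(z) = <a, z> + b (kept positive on the test points).
a, b = np.array([0.1, -0.2, 0.05]), 3.0
g = lambda z: np.dot(a, z) + b

# Perspective-transformed generator and its gradient (chain rule with affine g).
phi_g = lambda z: g(z) * phi(z / g(z))
def grad_phi_g(z):
    u = z / g(z)
    return a * (phi(u) - np.dot(u, grad_phi(u))) + grad_phi(u)

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)

lhs = bregman(phi_g, grad_phi_g, x, y)                    # distortion with generator phi_g
rhs = g(x) * bregman(phi, grad_phi, x / g(x), y / g(y))   # scaled divergence on transformed data
print(lhs, rhs)  # the two values agree up to floating-point error
```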

    Original language: English
    Pages (from-to): 19-27
    Number of pages: 9
    Journal: Advances in Neural Information Processing Systems
    Publication status: Published - 2016
    Event: 30th Annual Conference on Neural Information Processing Systems, NIPS 2016 - Barcelona, Spain
    Duration: 5 Dec 2016 - 10 Dec 2016
