On Clustering histograms with k-Means by using mixed α-divergences

Frank Nielsen*, Richard Nock, Shun Ichi Amari

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    24 Citations (Scopus)

    Abstract

    Clustering sets of histograms has become popular thanks to the success of the generic method of bag-of-X used in text categorization and in visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retrieval systems, we symmetrize the α-divergences using the concept of mixed divergences. First, we present a novel extension of k-means clustering to mixed divergences. Second, we extend the k-means++ seeding to mixed α-divergences and report a guaranteed probabilistic bound. Finally, we describe a soft clustering technique for mixed α-divergences.
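
    The distortion measures named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Amari's parameterization of the α-divergence for positive arrays (α ≠ ±1), and it assumes a mixed divergence of the form λ·D(p : q) + (1 − λ)·D(q : r). The abstract itself does not state either formula, and the function names are hypothetical.

```python
from typing import Sequence


def alpha_divergence(p: Sequence[float], q: Sequence[float], alpha: float) -> float:
    """Alpha-divergence D_alpha(p : q) between strictly positive histograms.

    Assumed form (Amari's parameterization, alpha != +/-1):
        4 / (1 - alpha^2) * sum_i [ (1-alpha)/2 * p_i + (1+alpha)/2 * q_i
                                    - p_i^((1-alpha)/2) * q_i^((1+alpha)/2) ]
    """
    if abs(alpha) == 1:
        raise ValueError("alpha = +/-1 are limit cases (KL-type) not handled here")
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    a = (1.0 - alpha) / 2.0
    b = (1.0 + alpha) / 2.0
    scale = 4.0 / (1.0 - alpha * alpha)
    return scale * sum(a * pi + b * qi - (pi ** a) * (qi ** b) for pi, qi in zip(p, q))


def mixed_alpha_divergence(
    p: Sequence[float],
    q: Sequence[float],
    r: Sequence[float],
    alpha: float,
    lam: float,
) -> float:
    """Assumed mixed divergence: lam * D(p : q) + (1 - lam) * D(q : r)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * alpha_divergence(p, q, alpha) + (1.0 - lam) * alpha_divergence(q, r, alpha)


def symmetrized_alpha_divergence(p: Sequence[float], q: Sequence[float], alpha: float) -> float:
    """Symmetrized divergence as the mixed divergence with r = p and lam = 1/2."""
    return mixed_alpha_divergence(p, q, p, alpha, 0.5)
```

    Under these assumptions, the symmetrized α-divergence is the special case of the mixed divergence with r = p and λ = 1/2, which is one way to read the abstract's remark that the α-divergences are symmetrized through mixed divergences. Histograms with empty bins would need smoothing before use, since the sketch requires strictly positive entries.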

    Original language: English
    Pages (from-to): 3273-3301
    Number of pages: 29
    Journal: Entropy
    Volume: 16
    Issue number: 6
    Publication status: Published - 2014
