Parametric model-based clustering

Vladimir Nikulin*, Alex J. Smola

*Corresponding author for this work

    Research output: Contribution to journal › Conference article › peer-review

    7 Citations (Scopus)

    Abstract

    Parametric, model-based algorithms learn generative models from the data, with each model corresponding to one particular cluster. Accordingly, a model-based partitional algorithm selects the most suitable model for each data object (Clustering step) and recomputes the parametric models using only the data from the corresponding clusters (Maximization step). This Clustering-Maximization framework has been widely used and has shown promising results in many applications, including complex variable-length data. The paper proposes the Experience-Innovation (EI) method as a natural extension of the Clustering-Maximization framework. The method includes 3 components: 1) keep the best past experience, which makes the empirical likelihood trajectory monotonic; 2) find a new model as a function of the existing models, so that the corresponding cluster splits existing clusters that have more elements and lower uniformity; 3) heuristic innovations, for example, several trials with random initial settings. We also introduce clustering regularisation based on a balanced combination of two conditions: 1) the significance of any particular cluster; 2) the difference between any 2 clusters. We illustrate the effectiveness of the proposed methods using a first-order Markov model applied to a large web-traffic dataset. The aim of the experiment is to explain and understand the way people interact with web sites.
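The Clustering and Maximization steps described in the abstract can be sketched for first-order Markov models as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_markov`, `log_lik`, `cluster_maximize`), the additive smoothing constant `alpha`, the random initial labelling, and the toy sequences are all hypothetical. Keeping the best-scoring solution seen so far is only a loose analogue of the "keep the best past experience" component; the model-splitting and regularisation components are not shown.

```python
import numpy as np

def fit_markov(seqs, n_states, alpha=1.0):
    """Estimate (initial probs, transition matrix) with additive smoothing (assumed)."""
    init = np.full(n_states, alpha)
    trans = np.full((n_states, n_states), alpha)
    for s in seqs:
        init[s[0]] += 1
        for a, b in zip(s[:-1], s[1:]):
            trans[a, b] += 1
    return init / init.sum(), trans / trans.sum(axis=1, keepdims=True)

def log_lik(seq, model):
    """Log-likelihood of one non-empty state sequence under a Markov model."""
    init, trans = model
    ll = np.log(init[seq[0]])
    for a, b in zip(seq[:-1], seq[1:]):
        ll += np.log(trans[a, b])
    return ll

def cluster_maximize(seqs, n_states, k, n_iter=20, seed=0):
    """Alternate Maximization and Clustering steps; return the best solution seen."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(seqs))  # random initial assignment (assumed)
    best = (-np.inf, None, None)
    for _ in range(n_iter):
        # Maximization step: refit each model on the sequences of its own cluster.
        models = [
            fit_markov([s for s, l in zip(seqs, labels) if l == c], n_states)
            for c in range(k)
        ]
        # Clustering step: assign each sequence to its most likely model.
        ll = np.array([[log_lik(s, m) for m in models] for s in seqs])
        new_labels = ll.argmax(axis=1)
        total = ll.max(axis=1).sum()
        # Keep the best past solution, so the reported score never decreases.
        if total > best[0]:
            best = (total, new_labels.copy(), models)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return best

# Toy click-streams over 3 hypothetical page categories (0, 1, 2).
seqs = [[0, 1, 0, 1, 0, 1], [1, 0, 1, 0, 1], [2, 2, 2, 2, 2], [2, 2, 2, 0, 2, 2]]
total, labels, models = cluster_maximize(seqs, n_states=3, k=2)
```

Here `total` is the summed log-likelihood of each sequence under its assigned model, `labels` holds one cluster index per sequence, and `models` holds one (initial distribution, transition matrix) pair per cluster. An empty cluster falls back to a uniform model because of the smoothing term.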

    Original language: English
    Article number: 22
    Pages (from-to): 190-201
    Number of pages: 12
    Journal: Proceedings of SPIE - The International Society for Optical Engineering
    Volume: 5812
    DOIs
    Publication status: Published - 2005
    Event: Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2005 - Orlando, FL, United States
    Duration: 28 Mar 2005 to 29 Mar 2005
