Sampling table configurations for the hierarchical Poisson-Dirichlet process

Changyou Chen*, Lan Du, Wray Buntine

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    27 Citations (Scopus)

    Abstract

    Hierarchical modeling and reasoning are fundamental in machine intelligence, and for this the two-parameter Poisson-Dirichlet Process (PDP) plays an important role. The most popular MCMC sampling algorithm for the hierarchical PDP and the hierarchical Dirichlet Process performs incremental sampling based on the Chinese restaurant metaphor, which originates from the Chinese restaurant process (CRP). In this paper, with the same metaphor, we propose a new table representation for hierarchical PDPs by introducing an auxiliary latent variable, called a table indicator, to record which customer takes responsibility for starting a new table. In this way, the new representation allows full exchangeability, which is an essential condition for a correct Gibbs sampling algorithm. Based on this representation, we develop a block Gibbs sampling algorithm that jointly samples the data item and its table contribution. We test this on the hierarchical Dirichlet process variant of latent Dirichlet allocation (HDP-LDA) developed by Teh, Jordan, Beal and Blei. Experimental results show that the proposed algorithm outperforms their "posterior sampling by direct assignment" algorithm in both out-of-sample perplexity and convergence speed. The representation can be used with many other hierarchical PDP models.
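
The Chinese restaurant metaphor mentioned in the abstract can be illustrated with a minimal sketch of the standard two-parameter CRP seating rule. This is not the paper's table-indicator sampler or its block Gibbs algorithm; it only shows the single-restaurant seating scheme the metaphor refers to. The function name and the parameter names `discount` and `concentration` are assumptions made for this illustration.

```python
import random


def sample_crp_seating(num_customers, discount, concentration, rng=None):
    """Seat customers one by one under the two-parameter CRP.

    Illustrative sketch only (names are hypothetical, not from the paper).
    Returns a list with the number of customers at each table.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")
    rng = rng or random.Random()

    table_sizes = []
    for _ in range(num_customers):
        if not table_sizes:
            # The first customer always starts the first table.
            table_sizes.append(1)
            continue
        # Existing table k is chosen with weight (n_k - discount).
        weights = [size - discount for size in table_sizes]
        # A new table is chosen with weight (concentration + discount * K),
        # where K is the current number of tables.
        weights.append(concentration + discount * len(table_sizes))
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(table_sizes):
            table_sizes.append(1)   # this customer starts a new table
        else:
            table_sizes[choice] += 1
    return table_sizes


if __name__ == "__main__":
    sizes = sample_crp_seating(1000, discount=0.5, concentration=1.0,
                               rng=random.Random(0))
    print(len(sizes), "tables for", sum(sizes), "customers")
```

In this sketch the customer who triggers the new-table branch is the one "responsible" for that table; the paper's table indicator records this kind of information as an explicit latent variable.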

    Original language: English
    Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2011, Proceedings
    Pages: 296-311
    Number of pages: 16
    Edition: PART 1
    Publication status: Published - 2011
    Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2011 - Athens, Greece
    Duration: 5 Sept 2011 to 9 Sept 2011

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Number: PART 1
    Volume: 6911 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2011
    Country/Territory: Greece
    City: Athens
    Period: 5/09/11 to 9/09/11
