Maximum likelihood learningwith arbitrary treewidth via fast-mixing parameter sets

Justin Domke*

*Corresponding author for this work

    Research output: Contribution to journalConference articlepeer-review

    1 Citation (Scopus)

    Abstract

    Inference is typically intractable in high-treewidth undirected graphical models, making maximum likelihood learning a challenge. One way to overcome this is to restrict parameters to a tractable set, most typically the set of tree-structured parameters. This paper explores an alternative notion of a tractable set, namely a set of "fast-mixing parameters" whereMarkov chainMonte Carlo (MCMC) inference can be guaranteed to quickly converge to the stationary distribution. While it is common in practice to approximate the likelihood gradient using samples obtained fromMCMC, such procedures lack theoretical guarantees. This paper proves that for any exponential family with bounded sufficient statistics, (not just graphical models) when parameters are constrained to a fast-mixing set, gradient descent with gradients approximated by sampling will approximate the maximum likelihood solution inside the set with high-probability. When unregularized, to find a solution ∈-accurate in log-likelihood requires a total amount of effort cubic in 1/∈, disregarding logarithmic factors. When ridge-regularized, strong convexity allows a solution ∈-accurate in parameter distance with effort quadratic in 1/∈. Both of these provide of a fully-polynomial time randomized approximation scheme.

    Original languageEnglish
    Pages (from-to)874-882
    Number of pages9
    JournalAdvances in Neural Information Processing Systems
    Volume2015-January
    Publication statusPublished - 2015
    Event29th Annual Conference on Neural Information Processing Systems, NIPS 2015 - Montreal, Canada
    Duration: 7 Dec 201512 Dec 2015

    Fingerprint

    Dive into the research topics of 'Maximum likelihood learningwith arbitrary treewidth via fast-mixing parameter sets'. Together they form a unique fingerprint.

    Cite this