Stochastic analysis of lexical and semantic enhanced structural language model

Shaojun Wang*, Shaomin Wang*, Li Cheng, Russell Greiner, Dale Schuurmans

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Citations (Scopus)

Abstract

In this paper, we present a directed Markov random field model that integrates trigram models, structural language models (SLM) and probabilistic latent semantic analysis (PLSA) for the purpose of statistical language modeling. The SLM is essentially a generalization of shift-reduce probabilistic push-down automata and is thus more complex and powerful than probabilistic context-free grammars (PCFGs). The added context-sensitiveness due to trigrams and PLSA, together with the violation of tree structure in the topology of the underlying random field model, makes the inference and parameter estimation problems plausibly intractable. However, the analysis of the behavior of the lexical and semantic enhanced structural language model leads to a generalized inside-outside algorithm and thus to rigorous, exact EM-type re-estimation of the composite language model parameters.

Original language: English
Title of host publication: Grammatical Inference
Subtitle of host publication: Algorithms and Applications - 8th International Colloquium, ICGI 2006, Proceedings
Publisher: Springer Verlag
Pages: 97-111
Number of pages: 15
ISBN (Print): 3540452648, 9783540452645
Publication status: Published - 2006
Externally published: Yes
Event: 8th International Colloquium on Grammatical Inference, ICGI 2006 - Tokyo, Japan
Duration: 20 Sept 2006 to 22 Sept 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4201 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th International Colloquium on Grammatical Inference, ICGI 2006
Country/Territory: Japan
City: Tokyo
Period: 20/09/06 to 22/09/06
