Topic chains for understanding a news corpus

Dongwoo Kim*, Alice Oh

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

39 Citations (Scopus)

Abstract

The Web is a vast resource and archive of news articles from around the world. We present a framework, based on probabilistic topic modeling, for uncovering the meaningful structure and trends of important topics and issues hidden within news archives on the Web. Central to the framework is the topic chain, a temporal organization of similar topics. We experimented with various topic similarity metrics and present our insights on how best to construct topic chains. We discuss how to interpret topic chains to understand a news corpus by looking at long-term topics, temporary issues, and shifts of focus within the chains. We applied our framework to a nine-month corpus of Korean Web news and present our findings.
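
The idea of a topic chain can be illustrated with a minimal sketch. The abstract does not state which similarity metric, threshold, or linking rule the authors settled on, so everything here is an assumption made for illustration: topics are given as word-probability lists over a shared vocabulary, similarity is cosine similarity, only adjacent time slices are linked, the 0.8 threshold is arbitrary, and a chain is taken to be a connected component of the resulting graph. The function names `cosine_similarity`, `link_topics`, and `build_chains` are hypothetical and do not come from the paper.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-word vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def link_topics(slices, threshold=0.8):
    """Link similar topics in adjacent time slices.

    `slices[t][i]` is the word distribution of topic i in time slice t.
    Returns edges of the form ((t, i), (t + 1, j)).
    The threshold is an illustrative assumption, not a value from the paper.
    """
    edges = []
    for t in range(len(slices) - 1):
        for i, p in enumerate(slices[t]):
            for j, q in enumerate(slices[t + 1]):
                if cosine_similarity(p, q) >= threshold:
                    edges.append(((t, i), (t + 1, j)))
    return edges


def build_chains(slices, threshold=0.8):
    """Group topics into chains (connected components of the link graph)."""
    parent = {
        (t, i): (t, i)
        for t, topics in enumerate(slices)
        for i in range(len(topics))
    }

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in link_topics(slices, threshold):
        parent[find(a)] = find(b)

    chains = {}
    for node in parent:
        chains.setdefault(find(node), []).append(node)
    return sorted(sorted(chain) for chain in chains.values())


# Toy input: three time slices over a four-word vocabulary.
slices = [
    [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]],
    [[0.6, 0.3, 0.1, 0.0], [0.1, 0.1, 0.7, 0.1]],
    [[0.65, 0.25, 0.1, 0.0]],
]
chains = build_chains(slices)
```

Under these assumptions, a chain that spans many slices would correspond to what the abstract calls a long-term topic, while a topic that links to nothing in its neighboring slices would read as a temporary issue. In the toy input, the first topic of each slice forms one chain across all three slices and the other two topics stay isolated.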

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 12th International Conference, CICLing 2011, Proceedings
Pages: 163-176
Number of pages: 14
Edition: PART 2
Publication status: Published - 2011
Externally published: Yes
Event: 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011 - Tokyo, Japan
Duration: 20 Feb 2011 to 26 Feb 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6609 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011
Country/Territory: Japan
City: Tokyo
Period: 20/02/11 to 26/02/11
