TY - GEN
T1 - Topic chains for understanding a news corpus
AU - Kim, Dongwoo
AU - Oh, Alice
PY - 2011
Y1 - 2011
N2 - The Web is a great resource and archive of news articles for the world. We present a framework, based on probabilistic topic modeling, for uncovering the meaningful structure and trends of important topics and issues hidden within the news archives on the Web. Central in the framework is a topic chain, a temporal organization of similar topics. We experimented with various topic similarity metrics and present our insights on how best to construct topic chains. We discuss how to interpret the topic chains to understand the news corpus by looking at long-term topics, temporary issues, and shifts of focus in the topic chains. We applied our framework to nine months of Korean Web news corpus and present our findings.
UR - http://www.scopus.com/inward/record.url?scp=79952268222&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-19437-5_13
DO - 10.1007/978-3-642-19437-5_13
M3 - Conference contribution
SN - 9783642194368
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 163
EP - 176
BT - Computational Linguistics and Intelligent Text Processing - 12th International Conference, CICLing 2011, Proceedings
T2 - 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011
Y2 - 20 February 2011 through 26 February 2011
ER -