TY - GEN
T1 - Significant term extraction by higher order SVD
AU - Manna, Sukanya
AU - Petres, Zoltán
AU - Gedeon, Tom
PY - 2009
Y1 - 2009
N2 - In this paper, we present a novel method for assessing term importance, called Tensor Term Indexing (TTI). It extracts significant terms from a single document as well as from a coherent collection of documents. The basic idea of this approach is to represent the whole document collection as a Term-Sentence-Document tensor and to employ higher-order singular value decomposition (HOSVD) for important term extraction. TTI uses a lower-rank approximation technique to reduce noise by eliminating anecdotal terms, to mitigate synonymy by merging the dimensions associated with terms that have similar meanings, and to mitigate polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning. Our evaluation shows that the TTI model can extract significant terms relevant to a topic from a small number of documents, which Term Frequency and Inverse Document Frequency (tfidf) cannot.
AB - In this paper, we present a novel method for assessing term importance, called Tensor Term Indexing (TTI). It extracts significant terms from a single document as well as from a coherent collection of documents. The basic idea of this approach is to represent the whole document collection as a Term-Sentence-Document tensor and to employ higher-order singular value decomposition (HOSVD) for important term extraction. TTI uses a lower-rank approximation technique to reduce noise by eliminating anecdotal terms, to mitigate synonymy by merging the dimensions associated with terms that have similar meanings, and to mitigate polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning. Our evaluation shows that the TTI model can extract significant terms relevant to a topic from a small number of documents, which Term Frequency and Inverse Document Frequency (tfidf) cannot.
UR - http://www.scopus.com/inward/record.url?scp=69849094114&partnerID=8YFLogxK
U2 - 10.1109/SAMI.2009.4956610
DO - 10.1109/SAMI.2009.4956610
M3 - Conference contribution
SN - 9781424438020
T3 - SAMI 2009 - 7th International Symposium on Applied Machine Intelligence and Informatics, Proceedings
SP - 63
EP - 68
BT - SAMI 2009 - 7th International Symposium on Applied Machine Intelligence and Informatics, Proceedings
T2 - 7th International Symposium on Applied Machine Intelligence and Informatics, SAMI 2009
Y2 - 30 January 2009 through 31 January 2009
ER -