Significant term extraction by higher order SVD

Sukanya Manna*, Zoltán Petres, Tom Gedeon

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    2 Citations (Scopus)

    Abstract

    In this paper, we present a novel method for measuring term importance, called Tensor Term Indexing (TTI). It extracts significant terms from a single document as well as from a coherent collection of documents. The basic idea of this approach is to represent the whole document collection as a Term-Sentence-Document tensor and to apply higher-order singular value decomposition (HOSVD) for important term extraction. TTI uses lower-rank approximation to reduce noise by eliminating anecdotal terms, to mitigate synonymy by merging the dimensions associated with terms that have similar meanings, and to mitigate polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning. Our evaluation shows that the TTI model can extract significant terms relevant to a topic from a small number of documents, which Term Frequency and Inverse Document Frequency (tfidf) cannot.
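
As a rough illustration of the pipeline the abstract describes, the following Python sketch builds a small term-sentence-document count tensor, computes a truncated HOSVD with NumPy, and ranks terms by the energy they retain in the low-rank approximation. The toy corpus, the raw-count weighting, the chosen ranks, and the `term_scores` rule are all assumptions made for illustration; the abstract does not specify how TTI weights tensor entries or scores terms.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns everything else."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """n-mode product: multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def truncated_hosvd(tensor, ranks):
    """Return (core, factors, approximation) for a truncated HOSVD."""
    # One orthonormal factor per mode: leading left singular vectors
    # of the corresponding unfolding.
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])

    # Core tensor: project the data onto each factor's subspace.
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)

    # Low-rank approximation: expand the core back to the original shape.
    approx = core
    for mode, u in enumerate(factors):
        approx = mode_product(approx, u, mode)
    return core, factors, approx


def term_scores(approx):
    """Hypothetical scoring rule (an assumption, not taken from the paper):
    Frobenius norm of each term's sentence-by-document slice."""
    return np.sqrt((approx ** 2).sum(axis=(1, 2)))


# Toy corpus (assumed data): each document is a list of tokenized sentences.
docs = [
    [["tensor", "decomposition", "extracts", "terms"],
     ["tensor", "rank", "reduces", "noise"]],
    [["svd", "reduces", "noise"],
     ["tensor", "svd", "extracts", "terms"]],
]
vocab = sorted({w for doc in docs for sent in doc for w in sent})
max_sents = max(len(doc) for doc in docs)

# Term x sentence x document tensor of raw counts.
X = np.zeros((len(vocab), max_sents, len(docs)))
for k, doc in enumerate(docs):
    for j, sent in enumerate(doc):
        for w in sent:
            X[vocab.index(w), j, k] += 1.0

core, factors, approx = truncated_hosvd(X, ranks=(2, 2, 2))
scores = term_scores(approx)

# Print terms from highest to lowest score.
for term, score in sorted(zip(vocab, scores), key=lambda p: -p[1]):
    print(f"{term:15s} {score:.3f}")
```

Reducing the rank along the term mode is what merges correlated term dimensions in this sketch; with full ranks the approximation reproduces the original tensor exactly, so the scores would reduce to plain count norms.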

    Original language: English
    Title of host publication: SAMI 2009 - 7th International Symposium on Applied Machine Intelligence and Informatics, Proceedings
    Pages: 63-68
    Number of pages: 6
    Publication status: Published - 2009
    Event: 7th International Symposium on Applied Machine Intelligence and Informatics, SAMI 2009 - Herl'any, Slovakia
    Duration: 30 Jan 2009 - 31 Jan 2009

    Publication series

    Name: SAMI 2009 - 7th International Symposium on Applied Machine Intelligence and Informatics, Proceedings

    Conference

    Conference: 7th International Symposium on Applied Machine Intelligence and Informatics, SAMI 2009
    Country/Territory: Slovakia
    City: Herl'any
    Period: 30/01/09 - 31/01/09
