TY - GEN
T1 - Semantic Hierarchical Document Signature for determining sentence similarity
AU - Manna, Sukanya
AU - Gedeon, Tom
PY - 2010
Y1 - 2010
N2 - In this paper, we present a new approach that incorporates semantic information from a document, in the form of a Hierarchical Document Signature (HDS), to measure semantic similarity between sentences. Because natural language expressions vary widely, it is essential to exploit the semantic properties of a document to accurately identify semantically similar sentences, since sentences conveying the same fact or concept may differ lexically and syntactically. Conversely, sentences that share many words may not convey the same meaning. This has a significant impact on the performance of many text mining applications that involve sentence-level judgment. Our HDS uses the natural hierarchy of the document and represents it in modular form, from document level to sentence level and from sentence level to word level, aggregating similarity components at the lower levels and propagating them to the next higher level to produce the final similarity between sentences. Our evaluation shows that the HDS model resembles the human decision-making process more closely than several vector space models that use only the 'bag of words' concept.
AB - In this paper, we present a new approach that incorporates semantic information from a document, in the form of a Hierarchical Document Signature (HDS), to measure semantic similarity between sentences. Because natural language expressions vary widely, it is essential to exploit the semantic properties of a document to accurately identify semantically similar sentences, since sentences conveying the same fact or concept may differ lexically and syntactically. Conversely, sentences that share many words may not convey the same meaning. This has a significant impact on the performance of many text mining applications that involve sentence-level judgment. Our HDS uses the natural hierarchy of the document and represents it in modular form, from document level to sentence level and from sentence level to word level, aggregating similarity components at the lower levels and propagating them to the next higher level to produce the final similarity between sentences. Our evaluation shows that the HDS model resembles the human decision-making process more closely than several vector space models that use only the 'bag of words' concept.
UR - http://www.scopus.com/inward/record.url?scp=78549259684&partnerID=8YFLogxK
U2 - 10.1109/FUZZY.2010.5584332
DO - 10.1109/FUZZY.2010.5584332
M3 - Conference contribution
SN - 9781424469208
T3 - 2010 IEEE World Congress on Computational Intelligence, WCCI 2010
BT - 2010 IEEE World Congress on Computational Intelligence, WCCI 2010
T2 - 2010 6th IEEE World Congress on Computational Intelligence, WCCI 2010
Y2 - 18 July 2010 through 23 July 2010
ER -