TY - GEN
T1 - Hierarchical document signature
T2 - 2009 IEEE International Conference on Fuzzy Systems
AU - Manna, Sukanya
AU - Mendis, B. Sumudu Udaya
AU - Gedeon, Tom
PY - 2009
Y1 - 2009
AB - We develop document computing procedures for analyzing discourse structures within a document, represented by hierarchical document signatures. A signature is a string of data characterizing a certain case (e.g., the characteristics of a sentence in the case of a document). The position of each data item is fixed within the string, so each position carries its own local value semantics. Fuzzy granulation is a semantic background technique for all kinds of information that originate from human estimation or are recorded through human valuation of numerical data. For the analysis of such data, we suggest developing special procedures that differ from the usual statistical methods. We use a form of fuzzy signature, called a hierarchical document signature, to modularize an unstructured document hierarchically: from document level to sentence level, from sentence level to attribute level, and then to word level. We use word occurrences as the information in the lowest module to find the similarity among modules at the next higher level by aggregating the signature values, which gives sentence-pair coherence.
KW - Aggregation
KW - Document signature
KW - Fuzzy measure
KW - Fuzzy signatures
KW - Sentence similarity
KW - Vector valued fuzzy set
UR - http://www.scopus.com/inward/record.url?scp=71249139807&partnerID=8YFLogxK
U2 - 10.1109/FUZZY.2009.5277054
DO - 10.1109/FUZZY.2009.5277054
M3 - Conference contribution
SN - 9781424435975
T3 - IEEE International Conference on Fuzzy Systems
SP - 1083
EP - 1088
BT - 2009 IEEE International Conference on Fuzzy Systems - Proceedings
Y2 - 20 August 2009 through 24 August 2009
ER -