TY - GEN
T1 - Improving hierarchical document signature performance by classifier combination
AU - Liao, Jieyi
AU - Mendis, B. Sumudu U.
AU - Manna, Sukanya
PY - 2010
Y1 - 2010
N2 - We present a classifier-combination experimental framework for part-of-speech (POS) tagging in which four different POS taggers are combined to obtain better sentence-similarity results using Hierarchical Document Signature (HDS). It is important to abstract the available information into humanly accessible structures. The way people think and talk is hierarchical: limited information is presented in any one sentence, and that information is always linked to further information. As such, HDS is a significant way to represent sentences when finding their similarity. POS tagging plays an important role in HDS. However, the available POS taggers are imperfect and tend to tag words incorrectly if the words are not properly cased or do not match the corpus on which the taggers were trained. Thus, different weighted voting strategies are used to overcome some of the drawbacks of these existing taggers. Individual taggers and combined taggers are compared under different voting strategies. The results show that the combined taggers provide better results than the individual ones.
AB - We present a classifier-combination experimental framework for part-of-speech (POS) tagging in which four different POS taggers are combined to obtain better sentence-similarity results using Hierarchical Document Signature (HDS). It is important to abstract the available information into humanly accessible structures. The way people think and talk is hierarchical: limited information is presented in any one sentence, and that information is always linked to further information. As such, HDS is a significant way to represent sentences when finding their similarity. POS tagging plays an important role in HDS. However, the available POS taggers are imperfect and tend to tag words incorrectly if the words are not properly cased or do not match the corpus on which the taggers were trained. Thus, different weighted voting strategies are used to overcome some of the drawbacks of these existing taggers. Individual taggers and combined taggers are compared under different voting strategies. The results show that the combined taggers provide better results than the individual ones.
KW - Different Tagging methods
KW - Hierarchical Document Signature
KW - Part-of-Speech Taggers
UR - http://www.scopus.com/inward/record.url?scp=78650217901&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-17537-4_84
DO - 10.1007/978-3-642-17537-4_84
M3 - Conference contribution
SN - 3642175368
SN - 9783642175367
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 695
EP - 702
BT - Neural Information Processing
T2 - 17th International Conference on Neural Information Processing, ICONIP 2010
Y2 - 22 November 2010 through 25 November 2010
ER -