TY - GEN
T1 - Extracting significant phrases from text
AU - Lui, Yuan J.
AU - Brent, Richard
AU - Calinescu, Ani
PY - 2007
Y1 - 2007
N2 - Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This paper introduces a new domain-independent keyphrase extraction algorithm. The algorithm approaches the problem of keyphrase extraction as a classification task, and uses a combination of statistical and computational linguistics techniques, a new set of attributes, and a new learning method to distinguish keyphrases from non-keyphrases. The experiments indicate that this algorithm performs at least as well as other keyphrase extraction tools and that it significantly outperforms Microsoft Word 2000's AutoSummarize feature.
AB - Prospective readers can quickly determine whether a document is relevant to their information need if the significant phrases (or keyphrases) in this document are provided. Although keyphrases are useful, not many documents have keyphrases assigned to them, and manually assigning keyphrases to existing documents is costly. Therefore, there is a need for automatic keyphrase extraction. This paper introduces a new domain-independent keyphrase extraction algorithm. The algorithm approaches the problem of keyphrase extraction as a classification task, and uses a combination of statistical and computational linguistics techniques, a new set of attributes, and a new learning method to distinguish keyphrases from non-keyphrases. The experiments indicate that this algorithm performs at least as well as other keyphrase extraction tools and that it significantly outperforms Microsoft Word 2000's AutoSummarize feature.
UR - http://www.scopus.com/inward/record.url?scp=35248898318&partnerID=8YFLogxK
U2 - 10.1109/AINAW.2007.180
DO - 10.1109/AINAW.2007.180
M3 - Conference contribution
SN - 0769528473
SN - 9780769528472
T3 - Proceedings - 21st International Conference on Advanced Information Networking and Applications Workshops/Symposia, AINAW'07
SP - 361
EP - 366
BT - Proceedings - 21st International Conference on Advanced Information Networking and Applications Workshops/Symposia, AINAW'07
T2 - 21st International Conference on Advanced Information Networking and Applications Workshops/Symposia, AINAW'07
Y2 - 21 May 2007 through 23 May 2007
ER -