TY - GEN
T1 - Efficient feature selection for polyp detection
AU - Seghouane, Abd Krim
AU - Ong, Ju Lynn
PY - 2010
Y1 - 2010
N2 - Computed tomographic colonography (CTC) is a promising alternative to traditional invasive colonoscopic methods used in the detection and removal of cancerous growths, or polyps, in the colon. Existing algorithms for CTC typically use a classifier to discriminate between true and false positives generated by a polyp candidate detection system. However, these classifiers often suffer from a phenomenon termed the curse of dimensionality, whereby the performance of a classifier degrades markedly as the number of features it uses increases. In addition, an increase in the number of features also increases computational complexity and demands on storage space. This paper demonstrates the benefits of feature selection with the aim of increasing specificity while preserving sensitivity in a polyp detection system. It also compares the performance of an individual (F-score) method and a mutual information (MI) method for feature selection on a polyp candidate database, in order to select a subset of features for optimum CAD performance. Experimental results show that SVM+MI seems to perform better when a small number of features is used, but the SVM+Fscore method seems to dominate when using the 30-50 best-ranked features. On the whole, the AUC measures reach 0.8-0.85 for the top-ranked 20-40 features using the MI or F-score methods, compared with 0.65-0.7 when using all 100 features in the worst-case scenario.
AB - Computed tomographic colonography (CTC) is a promising alternative to traditional invasive colonoscopic methods used in the detection and removal of cancerous growths, or polyps, in the colon. Existing algorithms for CTC typically use a classifier to discriminate between true and false positives generated by a polyp candidate detection system. However, these classifiers often suffer from a phenomenon termed the curse of dimensionality, whereby the performance of a classifier degrades markedly as the number of features it uses increases. In addition, an increase in the number of features also increases computational complexity and demands on storage space. This paper demonstrates the benefits of feature selection with the aim of increasing specificity while preserving sensitivity in a polyp detection system. It also compares the performance of an individual (F-score) method and a mutual information (MI) method for feature selection on a polyp candidate database, in order to select a subset of features for optimum CAD performance. Experimental results show that SVM+MI seems to perform better when a small number of features is used, but the SVM+Fscore method seems to dominate when using the 30-50 best-ranked features. On the whole, the AUC measures reach 0.8-0.85 for the top-ranked 20-40 features using the MI or F-score methods, compared with 0.65-0.7 when using all 100 features in the worst-case scenario.
KW - Computed tomography (CT)
KW - Feature selection
KW - Mutual information
KW - Support vector classifier
UR - http://www.scopus.com/inward/record.url?scp=78651107350&partnerID=8YFLogxK
U2 - 10.1109/ICIP.2010.5648923
DO - 10.1109/ICIP.2010.5648923
M3 - Conference contribution
SN - 9781424479948
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 2285
EP - 2288
BT - 2010 IEEE International Conference on Image Processing, ICIP 2010 - Proceedings
T2 - 2010 17th IEEE International Conference on Image Processing, ICIP 2010
Y2 - 26 September 2010 through 29 September 2010
ER -