TY - JOUR
T1 - Feature selection with redundancy-constrained class separability
AU - Zhou, Luping
AU - Wang, Lei
AU - Shen, Chunhua
PY - 2010/5
Y1 - 2010/5
N2 - Scatter-matrix-based class separability is a simple and efficient feature selection criterion in the literature. However, the conventional trace-based formulation does not take feature redundancy into account and is prone to selecting a set of discriminative but mutually redundant features. In this brief, we first theoretically prove that, in the context of this trace-based criterion, the existence of sufficiently correlated features can always prevent selecting the optimal feature set. Then, on top of this criterion, we propose redundancy-constrained feature selection (RCFS). To ensure the algorithm's efficiency and scalability, we study the characteristics of the constraints under which the resulting constrained 0-1 optimization can be efficiently and globally solved. By using the totally unimodular (TUM) concept in integer programming, a necessary condition for such constraints is derived. This condition reveals an interesting special case in which qualified redundancy constraints can be conveniently generated via a clustering of features. We study this special case and develop an efficient feature selection approach based on Dinkelbach's algorithm. Experiments on benchmark data sets demonstrate the superior performance of our approach over those without redundancy constraints.
AB - Scatter-matrix-based class separability is a simple and efficient feature selection criterion in the literature. However, the conventional trace-based formulation does not take feature redundancy into account and is prone to selecting a set of discriminative but mutually redundant features. In this brief, we first theoretically prove that, in the context of this trace-based criterion, the existence of sufficiently correlated features can always prevent selecting the optimal feature set. Then, on top of this criterion, we propose redundancy-constrained feature selection (RCFS). To ensure the algorithm's efficiency and scalability, we study the characteristics of the constraints under which the resulting constrained 0-1 optimization can be efficiently and globally solved. By using the totally unimodular (TUM) concept in integer programming, a necessary condition for such constraints is derived. This condition reveals an interesting special case in which qualified redundancy constraints can be conveniently generated via a clustering of features. We study this special case and develop an efficient feature selection approach based on Dinkelbach's algorithm. Experiments on benchmark data sets demonstrate the superior performance of our approach over those without redundancy constraints.
KW - Class separability measure
KW - Feature redundancy
KW - Feature selection
KW - Fractional programming
KW - Integer programming
UR - http://www.scopus.com/inward/record.url?scp=77951937407&partnerID=8YFLogxK
U2 - 10.1109/TNN.2010.2044189
DO - 10.1109/TNN.2010.2044189
M3 - Article
SN - 1045-9227
VL - 21
SP - 853
EP - 858
JO - IEEE Transactions on Neural Networks
JF - IEEE Transactions on Neural Networks
IS - 5
M1 - 5428785
ER -