TY - JOUR
T1 - Threshold-based clustering with merging and regularization in application to network intrusion detection
AU - Nikulin, V.
PY - 2006/11/15
Y1 - 2006/11/15
N2 - Signature-based intrusion detection systems look for known, suspicious patterns in the input data. In this paper we explore compression of labeled empirical data using threshold-based clustering with regularization. The main goal of clustering is to compress the training dataset to a limited number of signatures and, as a result, to minimize the number of comparisons necessary to determine the status of an input event. Essentially, the clustering process includes merging clusters that are close enough to each other. As a consequence, the original dataset is reduced to a limited number of labeled centroids. In combination with the k-nearest-neighbor (kNN) method, this set of centroids may be used as a multi-class classifier. Experiments on the KDD-99 intrusion detection dataset have confirmed the effectiveness of the above procedure.
AB - Signature-based intrusion detection systems look for known, suspicious patterns in the input data. In this paper we explore compression of labeled empirical data using threshold-based clustering with regularization. The main goal of clustering is to compress the training dataset to a limited number of signatures and, as a result, to minimize the number of comparisons necessary to determine the status of an input event. Essentially, the clustering process includes merging clusters that are close enough to each other. As a consequence, the original dataset is reduced to a limited number of labeled centroids. In combination with the k-nearest-neighbor (kNN) method, this set of centroids may be used as a multi-class classifier. Experiments on the KDD-99 intrusion detection dataset have confirmed the effectiveness of the above procedure.
KW - Distance-based clustering
KW - Intrusion detection
KW - k-nearest-neighbor method
UR - http://www.scopus.com/inward/record.url?scp=33750319953&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2005.11.015
DO - 10.1016/j.csda.2005.11.015
M3 - Article
SN - 0167-9473
VL - 51
SP - 1184
EP - 1196
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
IS - 2
ER -