TY - GEN
T1 - Threshold-based clustering for intrusion detection systems
AU - Nikulin, Vladimir
PY - 2006
Y1 - 2006
N2 - Signature-based intrusion detection systems look for known, suspicious patterns in the input data. In this paper we explore compression of labeled empirical data using threshold-based clustering with regularization. The main target of clustering is to compress the training dataset to a limited number of signatures and, as a result, to minimize the number of comparisons necessary to determine the status of an input event. Essentially, the process of clustering consists of merging clusters that are close enough to each other. As a consequence, the original dataset is reduced to a limited number of labeled centroids. In combination with the k-nearest-neighbor (kNN) method, this set of centroids may be used as a multiclass classifier. Clearly, different attributes have different importance depending on the particular training database. This importance may be regulated in the definition of the distance using linear weight coefficients. The paper introduces a special procedure to estimate these weight coefficients. Experiments on the KDD-99 intrusion detection dataset confirmed the effectiveness of the proposed methods.
AB - Signature-based intrusion detection systems look for known, suspicious patterns in the input data. In this paper we explore compression of labeled empirical data using threshold-based clustering with regularization. The main target of clustering is to compress the training dataset to a limited number of signatures and, as a result, to minimize the number of comparisons necessary to determine the status of an input event. Essentially, the process of clustering consists of merging clusters that are close enough to each other. As a consequence, the original dataset is reduced to a limited number of labeled centroids. In combination with the k-nearest-neighbor (kNN) method, this set of centroids may be used as a multiclass classifier. Clearly, different attributes have different importance depending on the particular training database. This importance may be regulated in the definition of the distance using linear weight coefficients. The paper introduces a special procedure to estimate these weight coefficients. Experiments on the KDD-99 intrusion detection dataset confirmed the effectiveness of the proposed methods.
KW - Distance-based clustering
KW - Intrusion detection
KW - k-nearest-neighbor method
UR - http://www.scopus.com/inward/record.url?scp=33747366161&partnerID=8YFLogxK
U2 - 10.1117/12.665326
DO - 10.1117/12.665326
M3 - Conference contribution
AN - SCOPUS:33747366161
SN - 0819462977
SN - 9780819462978
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2006
T2 - Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2006
Y2 - 17 April 2006 through 18 April 2006
ER -