TY - JOUR
T1 - Making clustering in delay-vector space meaningful
AU - Chen, Jason R.
PY - 2007/4
Y1 - 2007/4
N2 - Sequential time series clustering is a technique used to extract important features from time series data. The method can be shown to be the process of clustering in the delay-vector space formalism used in the Dynamical Systems literature. Recently, the startling claim was made that sequential time series clustering is meaningless. This has important consequences for a significant amount of work in the literature, since such a claim invalidates these works' contributions. In this paper, we show that sequential time series clustering is not meaningless, and that the problem highlighted in these works stems from their use of the Euclidean distance metric as the distance measure in the delay-vector space. As a solution, we consider quite a general class of time series, and propose a regime based on two types of similarity that can exist between delay vectors, giving rise naturally to an alternative distance measure to Euclidean distance in the delay-vector space. We show that, using this alternative distance measure, sequential time series clustering can indeed be meaningful. We repeat a key experiment in the work on which the "meaningless" claim was based, and show that our method leads to a successful clustering outcome.
AB - Sequential time series clustering is a technique used to extract important features from time series data. The method can be shown to be the process of clustering in the delay-vector space formalism used in the Dynamical Systems literature. Recently, the startling claim was made that sequential time series clustering is meaningless. This has important consequences for a significant amount of work in the literature, since such a claim invalidates these works' contributions. In this paper, we show that sequential time series clustering is not meaningless, and that the problem highlighted in these works stems from their use of the Euclidean distance metric as the distance measure in the delay-vector space. As a solution, we consider quite a general class of time series, and propose a regime based on two types of similarity that can exist between delay vectors, giving rise naturally to an alternative distance measure to Euclidean distance in the delay-vector space. We show that, using this alternative distance measure, sequential time series clustering can indeed be meaningful. We repeat a key experiment in the work on which the "meaningless" claim was based, and show that our method leads to a successful clustering outcome.
KW - Clustering
KW - Delay space
KW - STS clustering
KW - Sequential time series clustering
KW - Time series
UR - http://www.scopus.com/inward/record.url?scp=34147140547&partnerID=8YFLogxK
U2 - 10.1007/s10115-006-0042-6
DO - 10.1007/s10115-006-0042-6
M3 - Article
SN - 0219-1377
VL - 11
SP - 369
EP - 385
JO - Knowledge and Information Systems
JF - Knowledge and Information Systems
IS - 3
ER -