《An Online Algorithm for Segmenting Time Series》.pdf
文本预览下载声明
An Online Algorithm for Segmenting Time Series
Eamonn Keogh Selina Chu David Hart Michael Pazzani
Department of Information and Computer Science
University of California, Irvine, California 92697 USA
{eamonn, selina, dhart, pazzani}@
Abstract
In recent years, there has been an explosion of interest in mining time series databases. As with most
computer science problems, representation of the data is the key to efficient and effective solutions. One of
the most commonly used representations is piecewise linear approximation. This representation has been
used by various researchers to support clustering, classification, indexing and association rule mining of
time series data. A variety of algorithms have been proposed to obtain this representation, with several
algorithms having been independently rediscovered several times. In this paper, we undertake the first
extensive review and empirical comparison of all proposed techniques. We show that all these algorithms
have fatal flaws from a data mining perspective. We introduce a novel algorithm that we empirically show
to be superior to all others in the literature.
1. Introduction
In recent years, there has been an explosion of interest in mining time series databases. As with
most computer science problems, representation of the data is the key to efficient and effective
solutions. Several high level representations of time series have been proposed, including Fourier
Transforms [1,13], Wavelets [4], Symbolic Mappings [2, 5, 24] and Piecewise Linear
Representation (PLR). In this work, we confine our attention to PLR, perhaps the most frequently
used representation [8, 10, 12, 14, 15, 16, 17, 18, 20, 21, 22, 25
显示全部