Efficient pattern discovery in massive time series databases

Posted on:2007-12-15

Degree:Ph.D

Type:Dissertation

University:University of California, Riverside

Candidate:Wei, Li

Full Text:PDF

GTID:1448390005965529

Subject:Computer Science

Abstract/Summary:

Time series data is ubiquitous, produced by almost every human endeavor in science, industry, medicine, and business. To make the best use of this abundant source of data, researchers have long worked on algorithms that can extract useful information from massive time series databases. Most work in this area has concentrated on traditional data mining tasks such as clustering, classification, and anomaly detection. However, as time series become prevalent in more diverse domains, many new problems have emerged and attracted the attention of data mining researchers. In particular, some of these problems come from non-traditional time series domains, such as images, text, and video. Special consideration needs to be taken to make data mining algorithms effective in such domains. In addition, because of the sheer amount of time series data, the efficiency of the algorithms is critical to the data mining community.

Centered on the idea of efficient pattern discovery in large time series databases, this dissertation proposes algorithms for various pattern discovery tasks. All the problems studied in this work are new and are introduced here for the first time. First, we introduce time series query filtering, the problem of monitoring streaming time series for a set of predefined patterns. A novel envelope-based lower bounding technique is proposed to allow monitoring at higher bandwidths. We then present a general semi-supervised time series classification framework that constructs accurate classifiers with only a handful of labeled examples. In the second part of the dissertation, we turn our attention to shape data mining. An exact shape indexing technique is proposed that can handle rotation invariance with arbitrary representations and distance measures. Finally, we introduce and formally define the problem of shape discord discovery, that is, finding the most unusual shapes. Throughout this work, we make intensive use of our envelope-based lower bounding technique, which speeds up the data mining algorithms by orders of magnitude.
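
The abstract does not spell out how the envelope-based lower bound is built, so the following is only a minimal sketch under stated assumptions: Euclidean distance, equal-length patterns, and a pointwise min/max envelope over the whole pattern set. The function names (build_envelope, envelope_lower_bound, matching_patterns) are hypothetical and chosen for illustration; they are not taken from the dissertation. The idea shown is that one cheap comparison against the envelope can rule out every pattern at once, because the distance from a candidate to the envelope never exceeds its distance to any pattern enclosed by that envelope.

```python
import math


def build_envelope(patterns):
    """Pointwise upper and lower envelope over equal-length patterns."""
    upper = [max(column) for column in zip(*patterns)]
    lower = [min(column) for column in zip(*patterns)]
    return upper, lower


def envelope_lower_bound(candidate, upper, lower):
    """Distance from the candidate to the envelope.

    Only points that fall outside the envelope contribute, so the result
    cannot exceed the Euclidean distance to any enclosed pattern.
    """
    total = 0.0
    for c, u, l in zip(candidate, upper, lower):
        if c > u:
            total += (c - u) ** 2
        elif c < l:
            total += (l - c) ** 2
    return math.sqrt(total)


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matching_patterns(candidate, patterns, threshold):
    """Indices of patterns within `threshold` of the candidate.

    Assumption: a single envelope covers all patterns. If even the
    envelope is too far away, every pattern is skipped without computing
    any exact distance.
    """
    upper, lower = build_envelope(patterns)
    if envelope_lower_bound(candidate, upper, lower) > threshold:
        return []
    return [
        i for i, pattern in enumerate(patterns)
        if euclidean(candidate, pattern) <= threshold
    ]
```

In a streaming setting, matching_patterns would be called on each incoming window; the speedup comes from the windows that are rejected by the envelope test alone. A single envelope over many dissimilar patterns becomes loose, so a practical system would likely group similar patterns under separate envelopes, which this sketch does not attempt.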

Keywords/Search Tags:

Time series, Data, Pattern discovery, Algorithms

Related items

1	Knowledge Discovery In Time Series
2	Research And Improvement Of An Algorithm For Time-Series Patterns Discovery
3	Pattern-Based Data Mining on Diverse Multimedia and Time Series Data
4	Research On Several Techniques In Time Series Data Mining
5	Research On The Similarity-Based Representation And Pattern Search Of Time Series
6	Research On Key Techniques Of Discriminative Patterns Discovery And Classification Methods Of Time Series
7	Online Classification And Rule Discovery For Time Series Data
8	Meaningful Rule Discovery and Adaptive Classification of Multi-Dimensional Time Series Data
9	Multi-modal Time Series Data Error Discovery Algorithm Based On Hybrid Attention Mechanism
10	Research On Data Mining Technology Of Pattern-based Similarity Search In Time Series Database