Research And Implementation Of Time Series Classification Based On Semi-supervised Learning

Posted on:2012-06-01

Degree:Master

Type:Thesis

Country:China

Candidate:L X Wu

Full Text:PDF

GTID:2210330368987897

Subject:Computer application technology

Abstract/Summary:

Time series arise in many areas of life, including speech recognition and financial management, and their classification is an important topic in data mining. Traditional classification methods are either similarity-based or model-based. Both are supervised learning algorithms that need labeled time series to produce a reliable classifier, but labeled data are difficult to obtain. If only the small initial labeled set is used for training, the accuracy of the resulting classifier is very low. Unlabeled time series, by contrast, are easy to obtain. Semi-supervised methods, which train a classifier on labeled and unlabeled data together, have therefore become a focus of research.

This thesis focuses on semi-supervised classification of time series. Because a hidden Markov model (HMM) trained on only a small number of labeled time series classifies poorly, we discuss how to use the iterative self-training process to enlarge the labeled dataset and then train the HMM on the enlarged set to obtain a more accurate and reliable model. We also discuss how to use the iterative co-training process to enlarge the labeled dataset. In co-training, an HMM and a nearest-neighbor classifier serve as the two base classifiers; in each iteration, the HMM and the one-nearest-neighbor classifier each select some unlabeled data to label. Because some of the newly labeled data are labeled incorrectly, an editing method based on rough sets is introduced. Linear neighborhood propagation (LNP) is also improved by using the clustering result of rough-set-based K-means, which makes the constructed neighbor graph more reasonable.

Experimental results on the UCR time series datasets show that self-training and co-training improve accuracy. Taking the synthetic control dataset as an example, when each category has 4 labeled samples, self-training and co-training increase accuracy by 8.11% and 15.19%, respectively. Meanwhile, the improved LNP based on rough K-means clustering (K=4) achieves an accuracy 7.24% higher than that of LNP.
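
The self-training loop described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: for brevity it swaps the HMM for a one-nearest-neighbor classifier with Euclidean distance, and it treats a smaller distance as higher confidence. The function names (`euclidean`, `nearest_label`, `self_train`) and the `per_round` parameter are hypothetical, chosen only for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_label(series, labeled):
    """Return (label, distance) of the closest labeled series (1-NN)."""
    best_label, best_dist = None, math.inf
    for ref, label in labeled:
        d = euclidean(series, ref)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


def self_train(labeled, unlabeled, per_round=1):
    """Enlarge the labeled set by repeatedly labeling the most confident
    unlabeled series (assumption: smaller 1-NN distance = more confident)."""
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        # Score every remaining unlabeled series with the current labeled set.
        scored = [(nearest_label(s, labeled), s) for s in pool]
        scored.sort(key=lambda item: item[0][1])
        # Move the most confident predictions into the labeled set.
        for (label, _dist), s in scored[:per_round]:
            labeled.append((s, label))
            pool.remove(s)
    return labeled


if __name__ == "__main__":
    seed = [([0, 0, 0], "flat"), ([0, 5, 10], "up")]
    pool = [[0, 1, 0], [1, 6, 11], [0, 2, 1]]
    for series, label in self_train(seed, pool):
        print(series, label)
```

In a setup closer to the one the abstract describes, the classifier would be retrained on the enlarged set after each round, and a stopping criterion or an editing step would limit the effect of incorrectly labeled series.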

Keywords/Search Tags:

Semi-Supervised learning, Hidden Markov model, Self-Training, Co-Training, Linear neighborhood propagation

Related items

1	Hyperspectral Image Classification Based On Semi-supervised Collaboration-training Algorithm
2	Classification Of Remote Sensing Sea Ice Image With Collaborative Active Learning And Semi-supervised Learning
3	A Study On Burning State Recognition Of Clinker Based On Semi-supervised Independent Component Analysis And Hidden Markov Model
4	Research On Feature Extraction And Classification Algorithms Of Motor Imagery-based Brain-Computer Interface Based On EEG Using A Small Training Set
5	Modelling Of Near-infrared Spectroscopy Based On Semi-supervised Learning And Transfer Learning
6	Network Intrusion Detection Based On Co-training With Dynamic And Static Attributes In Complex Network Environments
7	Semantic Segmentation Method Of High-resolution Remote Sensing Images Based On Self-supervised Learning
8	Weihe River Water Quality-based Semi-supervised Learning Of Quantitative Remote Sensing Research
9	A family of robust second order training algorithms
10	Pretraining Method Of Medical Image Model Based On Self-Supervised Contrast Learning