Data Dimensionality Reduction And Classification Algorithms

Posted on:2017-04-19

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhou

Full Text:PDF

GTID:2180330482982361

Subject:Communication and Information System

Abstract/Summary:

With the development of information technology, data volumes have grown rapidly, which in turn has driven advances in machine learning theory. The higher the dimensionality of the example data, the harder the data are to store, the more computation they require, and the more noisy or redundant features they tend to contain. How to reduce the dimensionality of high-dimensional data, avoid the "curse of dimensionality", and improve classification accuracy has therefore become a hot topic in machine learning.
Non-negative matrix factorization (NMF) is a matrix decomposition algorithm that constrains all elements of the matrices involved, both the matrix being decomposed and the factors obtained from it, to be non-negative. This non-negativity constraint has a clear physical interpretation, which has drawn wide attention to NMF as a dimensionality reduction algorithm. Semi-supervised learning, meanwhile, combines a limited number of labeled examples with plentiful unlabeled ones to learn effectively, which overcomes the shortage of labeled examples in supervised learning and thus improves classification accuracy. It is therefore widely used in image classification, text classification and e-mail classification.
First, building on NMF and semi-supervised learning, the thesis proposes a semi-supervised learning algorithm based on non-negative matrix factorization and learning with consistency. Second, it proposes a novel semi-supervised learning approach based on constrained non-negative matrix factorization and learning with consistency, which introduces the class information of the limited labeled examples as constraints during dimensionality reduction and thereby enhances the representational power of the reduced-dimension features.
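The NMF step described above can be illustrated with a short sketch. It is not the thesis's algorithm: the abstract does not give the update rules, so the code assumes the standard multiplicative updates for the Frobenius-norm objective. The function name `nmf`, the rank, the iteration counts and the synthetic matrix `V` are illustrative choices only. The columns of `H` play the role of the reduced-dimension representation of the examples.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (features x examples) as W @ H.

    Assumption: standard multiplicative updates for ||V - W H||_F^2;
    the thesis abstract does not specify which updates it uses.
    """
    rng = np.random.default_rng(seed)
    n_features, n_examples = V.shape
    # Strictly positive random initialisation keeps the factors non-negative.
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_examples)) + eps
    for _ in range(n_iter):
        # Element-wise updates; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def relative_error(V, W, H):
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# Synthetic non-negative data: 20 features, 50 examples, true rank 3.
data_rng = np.random.default_rng(1)
V = data_rng.random((20, 3)) @ data_rng.random((3, 50))

W1, H1 = nmf(V, rank=3, n_iter=1)     # after a single update
W, H = nmf(V, rank=3, n_iter=200)     # after 200 updates

err_start = relative_error(V, W1, H1)
err_final = relative_error(V, W, H)
print("relative error after 1 iteration:   ", err_start)
print("relative error after 200 iterations:", err_final)
print("reduced representation shape:", H.shape)  # (rank, n_examples)
```

Each column of `H` is a 3-dimensional non-negative code for one 20-dimensional example, and a classifier would then be trained on those codes rather than on the raw features.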
Finally, the thesis introduces the dependencies between classes and proposes a semi-supervised learning algorithm based on a class graph and dimensionality reduction. The algorithm builds one graph over the examples and another over the classes, uses them to construct a graph-regularization framework, and then obtains the labels of the unlabeled samples by solving a Sylvester equation. Experimental results on public datasets show that the proposed algorithms both exploit the limited labeled examples and reduce the dimensionality of the datasets: they not only reduce the data dimensionality effectively but also improve the generalization ability of the classifier.
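To show how a graph regularizer over both examples and classes can lead to a Sylvester equation, here is a minimal sketch under stated assumptions. The abstract does not give the objective, so the code assumes the hypothetical cost ||F - Y||_F^2 + alpha * tr(F^T L_x F) + beta * tr(F L_c F^T), where L_x and L_c are the Laplacians of the example graph and the class graph. Setting its gradient to zero gives (I + alpha * L_x) F + F (beta * L_c) = Y, which has the Sylvester form A F + F B = Q. The toy data, the Gaussian affinity, the class similarity matrix and the values of `alpha` and `beta` are all made up for illustration.

```python
import numpy as np

def laplacian(S):
    """Unnormalised graph Laplacian D - S of a symmetric affinity matrix."""
    return np.diag(S.sum(axis=1)) - S

def solve_sylvester_kron(A, B, Q):
    """Solve A X + X B = Q through the Kronecker-product linear system.

    Uses vec(A X) = (I kron A) vec(X) and vec(X B) = (B^T kron I) vec(X)
    with column-major vec. Only suitable for small problems.
    """
    n, c = Q.shape
    K = np.kron(np.eye(c), A) + np.kron(B.T, np.eye(n))
    x = np.linalg.solve(K, Q.flatten(order="F"))
    return x.reshape((n, c), order="F")

# Toy data: two well-separated clusters of three 2-D points each.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.0],
              [3.0, 3.0], [3.1, 2.8], [2.9, 3.2]])

# One labeled example per cluster; all other rows of Y stay zero (unlabeled).
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0

# Example graph: Gaussian affinities between points, no self-loops.
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
S_x = np.exp(-sq_dist)
np.fill_diagonal(S_x, 0.0)

# Class graph: a weak, made-up dependency between the two classes.
S_c = np.array([[0.0, 0.1],
                [0.1, 0.0]])

alpha, beta = 1.0, 0.5               # illustrative regularisation weights
A = np.eye(6) + alpha * laplacian(S_x)
B = beta * laplacian(S_c)

F = solve_sylvester_kron(A, B, Y)    # soft label scores, one row per example
labels = F.argmax(axis=1)            # predicted class per example
print(labels)
```

On this toy data each unlabeled point takes the class of the labeled point in its own cluster. The Kronecker formulation builds an (n*c) x (n*c) system, so for larger problems a dedicated solver such as `scipy.linalg.solve_sylvester(A, B, Q)` is the more practical choice.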

Keywords/Search Tags:

machine learning, the curse of dimensionality, dimensionality reduction, semi-supervised learning, Sylvester equation

Related items

1	Application Of Dimensionality Reduction Via Local Smoothness Assumption
2	Semiparametric Inference For Expectile Regression In The Semi-supervised Learning Framework
3	Research On The Algorithm Of Fisher Linear Discriminant Analysis
4	Research On Prestack Seismic Waveform Classification Method Based On Semi-Supervision
5	Research On Key Issues And Algorithms Of Quantum Machine Learning
6	The Study On Classification And Prediction For High Dimensionality Biological Data
7	Key Techniques Research On Quantum Machine Learning
8	Research On Dimensionality Reduction And Clustering Method Of ScRNA-seq Data Based On Deep Learning
9	Research On Dimensionality Reduction Based On Uniform Manifold Approximation And Projection
10	Graph Optimization Framework Based Dimensionality Reduction Approaches And Their Applications