Research On Machine Learning Classification Based On Dimensionality Reduction

Posted on: 2018-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Hu
Full Text: PDF
GTID: 2348330515473978
Subject: Applied Statistics
Abstract/Summary:
In modern society, advances in information technology have steadily lowered the cost of data acquisition. As massive datasets continue to emerge, data dimensionality keeps rising as well. In general, the higher the dimensionality of the data, the greater the computational complexity and the more severe the negative effects of noise and redundancy. How to reduce data dimensionality while improving classification accuracy has therefore become an important problem in machine learning. This thesis studies the influence of dimensionality reduction on the performance of machine learning classification. First, it constructs an analysis framework for classification with dimensionality reduction, combining two dimensionality reduction methods, the non-linear locally linear embedding (LLE) and the linear principal component analysis (PCA), with five machine learning classification methods: Gradient Boosting Decision Tree (GBDT), Random Forest, Support Vector Machine (SVM), K Nearest Neighbor (KNN), and Logistic Regression. It then uses a handwritten digit recognition dataset to analyze the classification performance of these five methods on data reduced to different dimensions by the different dimensionality reduction methods. The analysis shows that applying an appropriate dimensionality reduction method before classification can effectively improve classification accuracy; that non-linear dimensionality reduction generally yields better classification results than linear dimensionality reduction; that different machine learning classification algorithms differ significantly in their sensitivity to the number of dimensions; and that when dimensionality reduction preserves classification accuracy, it can greatly reduce model training time.
Keywords/Search Tags: Dimensionality Reduction, Machine Learning, Classification Problem, Handwritten Numeral Recognition
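
The analysis framework described in the abstract (reduce the dimension, then classify, then compare accuracy and training time) can be sketched as follows. This is only an illustrative sketch, not the thesis's actual code: it assumes scikit-learn, uses scikit-learn's bundled 8x8 digits dataset as a stand-in for the handwritten digit data, and the function name `evaluate`, the target dimension, the train/test split, the standardization step, and all hyperparameters are assumptions chosen for brevity.

```python
import time

from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def evaluate(n_components=20, seed=0):
    """Return {(reducer, classifier): (test accuracy, classifier fit seconds)}.

    Hypothetical helper: dataset, split and hyperparameters are assumptions.
    """
    X, y = load_digits(return_X_y=True)  # 1797 samples, 64 pixel features
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # One linear and one non-linear reducer, as in the abstract.
    reducers = {
        "PCA": PCA(n_components=n_components),
        "LLE": LocallyLinearEmbedding(
            n_neighbors=30, n_components=n_components, eigen_solver="dense"
        ),
    }
    # The five classifier families named in the abstract (small settings).
    classifiers = {
        "GBDT": lambda: GradientBoostingClassifier(n_estimators=30, random_state=seed),
        "RandomForest": lambda: RandomForestClassifier(n_estimators=100, random_state=seed),
        "SVM": lambda: SVC(),
        "KNN": lambda: KNeighborsClassifier(n_neighbors=5),
        "LogisticRegression": lambda: LogisticRegression(max_iter=1000),
    }

    results = {}
    for r_name, reducer in reducers.items():
        # Fit the reducer on training data only, then project both splits.
        Z_train = reducer.fit_transform(X_train)
        Z_test = reducer.transform(X_test)
        # Standardize the embedding (an assumption; LLE coordinates are tiny).
        scaler = StandardScaler().fit(Z_train)
        Z_train, Z_test = scaler.transform(Z_train), scaler.transform(Z_test)

        for c_name, make_clf in classifiers.items():
            clf = make_clf()
            start = time.perf_counter()
            clf.fit(Z_train, y_train)
            fit_seconds = time.perf_counter() - start
            results[(r_name, c_name)] = (clf.score(Z_test, y_test), fit_seconds)
    return results


if __name__ == "__main__":
    for (r_name, c_name), (acc, secs) in evaluate().items():
        print(f"{r_name:>3} + {c_name:<18} accuracy={acc:.3f} fit={secs:.2f}s")
```

Calling `evaluate` with several values of `n_components` gives the kind of accuracy-versus-dimension and training-time comparison the abstract refers to. The numbers from this sketch are not the thesis's results and should not be read as confirming or contradicting them.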