Research On Feature Selection Algorithm Based On Data Similarity

Posted on:2019-01-05

Degree:Master

Type:Thesis

Country:China

Candidate:Q J Tuo

Full Text:PDF

GTID:2428330545485540

Subject:Applied Mathematics

Abstract/Summary:

In the internet era, data are growing in sample count, feature dimensionality, and complexity of class structure. Feature selection extracts useful information from such massive, complex data and has become a hot topic in machine learning and data mining. This dissertation proposes three feature selection algorithms that exploit the relationships within data from three perspectives: samples, features, and classes. The main contributions are as follows.

1. From the perspective of samples, we propose a feature selection algorithm based on double sample similarity. First, the sample similarity matrix is built from the pairwise distances between samples and the reconstruction coefficients of nearest-neighbor samples, and a low-dimensional space is constructed from it. Then, a norm is introduced into the low-dimensional space to obtain the feature weight matrix. Finally, we define an evaluation index that measures feature importance and use it to select the optimal feature subset.

2. From the perspective of features, we propose a feature selection algorithm based on the similarity of reconstructed features. First, feature reconstruction is used to obtain a feature similarity matrix, which is then used to transform the original sample space. Then, the transformed sample space is fitted to the label space by minimizing the empirical error. Finally, the feature weight matrix is optimized and updated, and then used to perform feature selection.

3. From the perspective of classes, we propose a feature selection algorithm based on the similarity of nearest-neighbor classes. First, to obtain the class similarity matrix, the parent-child relationships between classes are used to model the hierarchical structure among nearest-neighbor classes. Then, the class similarity matrix supplies information about the nearest-neighbor classes, which is used to update the parameters of the current class. Finally, the feature weight matrix is obtained and used to select the best feature subset.
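
The general pipeline shared by these methods (similarity matrix, low-dimensional space, regularized weight matrix, per-feature score) can be illustrated with a minimal sketch. It is not the algorithm from the dissertation, whose exact objective the abstract does not give. The sketch assumes a Gaussian kernel on pairwise distances as the sample similarity (the reconstruction-coefficient term is omitted), graph-Laplacian eigenvectors as the low-dimensional space, and a ridge penalty standing in for the unspecified norm. All function names and parameters (`sigma`, `dim`, `lam`) are hypothetical.

```python
import numpy as np


def sample_similarity(X, sigma=1.0):
    """Gaussian similarity between samples (rows of X), zero on the diagonal."""
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)            # squared pairwise distances
    S = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(S, 0.0)
    return S


def spectral_embedding(S, dim=1):
    """Low-dimensional coordinates from the unnormalized graph Laplacian."""
    L = np.diag(S.sum(axis=1)) - S
    _, vecs = np.linalg.eigh(L)              # eigenvalues in ascending order
    # Skip the first eigenvector (constant when the graph is connected).
    return vecs[:, 1:dim + 1]


def feature_scores(X, Y, lam=1e-2):
    """Ridge-regress the embedding Y on standardized features; score = row norm of W."""
    std = X.std(axis=0)
    std[std == 0.0] = 1.0                    # guard against constant features
    Xs = (X - X.mean(axis=0)) / std
    d = Xs.shape[1]
    # Assumption: ridge penalty used in place of the norm in the abstract.
    W = np.linalg.solve(Xs.T @ Xs + lam * np.eye(d), Xs.T @ Y)
    return np.linalg.norm(W, axis=1)


def select_features(X, n_select, sigma=1.0, dim=1, lam=1e-2):
    """Return indices of the n_select highest-scoring features and all scores."""
    S = sample_similarity(X, sigma)
    Y = spectral_embedding(S, dim)
    scores = feature_scores(X, Y, lam)
    return np.argsort(-scores)[:n_select], scores
```

Calling `select_features(X, n_select=10)` on an `(n_samples, n_features)` array returns the indices of the ten features whose weights contribute most to reproducing the embedding. A row-sparse penalty would push whole rows of the weight matrix toward zero, which ridge does not; it is used here only because it has a closed-form solution.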

Related items

1	Unsupervised Feature Selection Based On Sparse Regression
2	Unsupervised Clustering Algorithm Based On Dimension Reduction
3	Application Of Lp Norm Regularized Regressions In Classification Problems
4	The Research And Application Of Clustering Feature Selection Methods
5	Similarity Matrix And Spectral Clustering
6	Clustering Algorithm Based On Robust Non-negative Matrix Factorization
7	Research On Key Technologies Of Semi-Supervised Feature Selection
8	A Study On Algorithms Of Matrix Recovery And Their Applications Based On The Truncated Norm
9	A Study On Similarity Measures For Feature Selection Method OFFSS
10	Face Recognition Via Fractional Matrix Norm