Feature Selection Based On Feature Extraction

Posted on: 2012-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: C L Yu
Full Text: PDF
GTID: 2218330338962966
Subject: Computer application technology
Abstract/Summary:
In many real-world problems, such as pattern recognition and data mining, dimensionality reduction is an essential step before data analysis. Feature selection and feature extraction are two commonly adopted approaches. Feature selection chooses features in the measurement space, so the result is a subset of the original features, whereas feature extraction constructs features in a transformed space by finding a mapping from the original feature space to a lower-dimensional feature space.

Most researchers study feature selection and feature extraction separately. This thesis combines them and develops two algorithms: feature selection based on principal component analysis (PCA) and feature selection based on linear discriminant analysis (LDA).

Feature selection based on PCA starts from the observation that the axes of the lower-dimensional space, i.e., the principal components, are a set of new variables with no clear physical meaning. As a result, interpreting results obtained in the lower-dimensional PCA space, and acquiring data for test samples, still involve all of the original measurements. To identify the original features that are critical to the principal components, the author develops a new method that uses a k-nearest neighbor clustering procedure and three new similarity measures to link the physically meaningless principal components back to a subset of the original measurements. Experiments on UCI data sets and a face database are conducted to show the superiority of the method.

Feature selection for high-dimensional data based on LDA is a new hierarchical filter model that combines the Fisher criterion for single features with another feature selection technique, removing irrelevant features and redundant features in separate stages. In the first stage, the Fisher criterion serves as a filter that removes irrelevant and noisy features.
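
As an illustration of a single-feature Fisher criterion used as a filter, the following is a minimal sketch in Python with NumPy. It assumes one common form of the score (class-size-weighted between-class scatter divided by class-size-weighted within-class variance, computed per feature); the abstract does not give the exact formula used in the thesis, and the function names `fisher_scores` and `fisher_filter` are hypothetical.

```python
import numpy as np

def fisher_scores(X, y):
    """Per-feature Fisher score (assumed form, not taken from the thesis):
    sum_c n_c * (mean_c - mean)^2  /  sum_c n_c * var_c
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]          # samples belonging to this class
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)
    # Guard against division by zero for features that are constant within classes.
    return between / np.maximum(within, 1e-12)

def fisher_filter(X, y, keep):
    """Return the indices of the `keep` highest-scoring features, best first."""
    scores = fisher_scores(X, y)
    return np.argsort(scores)[::-1][:keep]
```

A feature whose class means are far apart relative to its within-class spread receives a high score and survives the filter; how many features to keep (`keep`) is a free choice in this sketch.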
Based on an analysis of feature correlation in high-dimensional data, a redundancy measurement and the Fast Correlation-Based Filter (FCBF) algorithm are then used as a filter to remove redundant features. To test whether the model can substantially reduce the feature dimensionality with little loss of classification accuracy, the author conducts experiments on four publicly available data sets and on face data sets with different poses, expressions, backgrounds and occlusions for gender classification. The results show that the model greatly reduces the feature dimensionality with little loss of classification accuracy.
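
The redundancy-removal stage can be pictured with a simplified greedy filter. The sketch below is not FCBF itself: FCBF measures correlation with symmetric uncertainty, whereas this example substitutes absolute Pearson correlation and a fixed threshold to stay short. The function name `remove_redundant`, the `threshold` parameter, and the `relevance` input (any per-feature relevance score, higher meaning more relevant) are assumptions made for illustration.

```python
import numpy as np

def remove_redundant(X, relevance, threshold=0.9):
    """Greedy redundancy filter (simplified stand-in, not the FCBF algorithm).

    Features are visited from most to least relevant; a feature is kept only
    if its |Pearson correlation| with every already kept feature is at most
    `threshold`. Assumes X has no constant columns (their correlation is NaN).
    """
    X = np.asarray(X, dtype=float)
    corr = np.abs(np.corrcoef(X, rowvar=False))   # feature-by-feature matrix
    order = np.argsort(relevance)[::-1]           # most relevant first
    kept = []
    for j in order:
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(int(j))
    return kept
```

Because the more relevant feature is always visited first, each group of highly correlated features is represented by its most relevant member.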
Keywords/Search Tags: Feature selection, Feature extraction, Principal Component Analysis, Linear Discriminant Analysis