
Feature Selection Method And Its Applied Research In The Diagnosis Of The Erythema Squamous Skin Diseases

Posted on: 2011-12-04
Degree: Master
Type: Thesis
Country: China
Candidate: C X Wang
GTID: 2204360308967814
Subject: Computer software and theory
Abstract/Summary:
In data mining, machine learning and pattern recognition, feature selection is an important data preprocessing step and an essential part of supervised learning algorithms. In recent years, with the emergence of large-scale datasets, especially in image processing and gene expression analysis, feature selection has become a very active research area and faces new challenges. Feature selection algorithms that are both accurate and efficient are needed to reduce the dimensionality of high-dimensional datasets. This thesis focused on feature selection for high-dimensional datasets and proposed new feature selection algorithms for diagnosing erythemato-squamous diseases. The contributions of this thesis are mainly as follows.

Firstly, this thesis made a specific and in-depth analysis of the current open problems in feature selection. We then gave the definition of feature selection, described the difference between feature selection and feature extraction, and introduced four aspects of feature selection methods as well as the filter and wrapper approaches. After that, we introduced some conventional search strategies for feature selection and offered guidance on using them.

Secondly, an improved F-score feature selection method was proposed. The original F-score is a simple technique that measures the discrimination between two sets of real numbers. The improved F-score we proposed can measure the discrimination among more than two sets of real numbers.

Thirdly, based on the merits and demerits of the filter and wrapper feature selection models, a coupled model for feature selection was proposed. This model combined IFSFS (Improved F-score and Sequential Forward Search) with SVM (Support Vector Machines) to carry out feature selection.
In this model, the improved F-score serves as the evaluation criterion for features, SFS serves as the search method, and SVM is used to evaluate the features selected via the improved F-score. The dermatology dataset of erythemato-squamous diseases from the UCI database was then used to test the proposed model. The experimental results demonstrated that the model based on IFSFS and SVM is efficient in diagnosing erythemato-squamous diseases and achieves high classification accuracy.

Finally, SFS has the disadvantage that once a feature is selected, it is never removed from the selected set. To address this, the thesis proposed another feature selection method based on IFSFFS (Improved F-score and Sequential Floating Forward Search) and SVM. The experimental results on diagnosing erythemato-squamous diseases demonstrated that the method combining IFSFFS and SVM is more efficient and achieves higher classification accuracy.
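
The abstract does not give the formula for the improved F-score or the exact search procedure, so the following is only a minimal sketch under stated assumptions. It assumes the multi-class score is the sum of squared distances of the class means from the overall mean, divided by the sum of the per-class sample variances, which reduces to the usual two-class F-score. The names `multiclass_f_score`, `forward_select`, `evaluate` and `floating` are hypothetical, and `evaluate` stands in for whatever wrapper criterion is used (in the thesis, SVM classification accuracy).

```python
from statistics import mean, variance


def multiclass_f_score(values, labels):
    """Score one feature across two or more classes (assumed generalization).

    Each class needs at least two samples, because the within-class term
    uses the sample variance (n_j - 1 in the denominator).
    """
    overall = mean(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    # Between-class scatter: squared distance of each class mean from the overall mean.
    between = sum((mean(g) - overall) ** 2 for g in groups.values())
    # Within-class scatter: sum of the per-class sample variances.
    within = sum(variance(g) for g in groups.values())
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def forward_select(columns, labels, evaluate, floating=False):
    """Filter-plus-wrapper forward search (a sketch, not the thesis's exact procedure).

    columns  -- list of feature columns, each a list of numbers
    evaluate -- hypothetical callable mapping a list of feature indices to a score
    floating -- if True, try dropping earlier features after each accepted addition
    """
    # Filter step: rank features by descending F-score.
    ranked = sorted(
        range(len(columns)),
        key=lambda i: multiclass_f_score(columns[i], labels),
        reverse=True,
    )
    selected, best = [], float("-inf")
    for i in ranked:
        # Wrapper step: keep the candidate only if the score strictly improves.
        score = evaluate(selected + [i])
        if score <= best:
            continue
        selected, best = selected + [i], score
        # Floating step: remove an earlier feature whenever that strictly helps.
        improved = floating
        while improved and len(selected) > 1:
            improved = False
            for j in selected[:-1]:
                trial = [k for k in selected if k != j]
                trial_score = evaluate(trial)
                if trial_score > best:
                    selected, best, improved = trial, trial_score, True
                    break
    return selected, best
```

With `floating=False` a selected feature is never removed, which is the SFS limitation described above; with `floating=True` the backward step can discard a feature that has become redundant. One possible choice for `evaluate`, again an assumption rather than something stated in the abstract, is the mean cross-validated accuracy of scikit-learn's `SVC` computed with `cross_val_score` on the chosen columns.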
Keywords/Search Tags: feature selection, F-score, Sequential Forward Search, Sequential Floating Forward Search