The Research Of Feature Selection Method Based On Self-information Measures

Posted on: 2020-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Huang
GTID: 2370330575486602
Subject: Applied Mathematics
Abstract/Summary:
Self-information was proposed by Shannon to describe the uncertainty of a signal output. Applied to a decision system, self-information can describe the uncertainty of decision-making, and it is an effective means of evaluating decision-making ability. Based on the fuzzy rough set model and the neighborhood rough set model, this dissertation discusses a complete theory of self-information measures. On this basis, we study feature selection methods for the two models.

1. Feature selection method based on neighborhood self-information measures. The neighborhood rough set is one of the mathematical tools for dealing with uncertainty in artificial intelligence, and feature selection based on the neighborhood rough set model is an important research topic in data mining. In a neighborhood rough set, the positive region is usually used to reflect the classification ability of a feature subset. However, the positive region is not a valid estimate of classification accuracy, because it considers only the lower-approximation neighborhood information of consistent decisions and ignores the upper-approximation information in the neighborhood of the decision boundary. According to Bayesian classification rules, boundary samples also carry classification information. Therefore, by introducing the concept of a decision self-information measure and using the upper and lower approximations of neighborhood rough set theory, we construct four uncertainty measures of decision variables and discuss their properties in detail. The fourth measure, the relative decision neighborhood self-information, not only considers the classification information of both the neighborhood upper approximation and the neighborhood lower approximation, but is also more sensitive to feature changes: it reflects the change in the neighborhood self-information of a decision caused by slight changes in the feature combination. Based on this fourth neighborhood self-information measure, a feature evaluation function (a dependency function) that reflects classification capability is constructed, and an attribute reduction algorithm is designed. Numerical experiments analyze the algorithm and compare it with several existing algorithms.

2. Feature selection method based on fuzzy self-information measures. The fuzzy rough set is one of the most effective tools for dealing with classification uncertainty. However, this model considers only the information provided by the lower approximation of a decision, whereas in reality the uncertainty information is related to both the lower and the upper approximation. Therefore, in this study we combine the fuzzy lower and upper approximations with the concept of self-information to construct four uncertainty measures, which can be used to evaluate the classification ability of attribute subsets. The relationships among these measures are discussed in detail, and it is shown that the fourth measure is theoretically better suited to feature selection, because it not only considers the classification information of both the upper and lower approximations of a decision, but is also more sensitive to changes in the features. It is proved that the four self-information measures are generalizations of classical fuzzy rough set methods. Finally, a greedy feature selection algorithm is designed based on the fourth measure. Its experimental results are compared with those of three other algorithms to verify the effectiveness of the proposed method.
Keywords/Search Tags: Fuzzy rough set, Neighborhood rough set, Fuzzy self-information, Neighborhood self-information