
Research On Rough Set Model Extension And Attribute Reduction Algorithm Based On Incomplete Information Systems

Posted on: 2022-05-01
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Li
Full Text: PDF
GTID: 2480306542962869
Subject: Software engineering
Abstract/Summary:
Rough set theory, proposed by Z. Pawlak in 1982, is a mathematical tool for describing incomplete and uncertain knowledge. It has been applied in intelligent computing research and is also widely used in fields such as data mining in KDD and text classification. In many application areas the data are not only complex in type but also incomplete for various reasons, which poses new challenges for the further study of rough sets. For example, the decision-theoretic rough set model, as an extended model, can handle various types of data, yet incomplete continuous data have not been studied under the existing decision-theoretic rough set model. As another example, for unbalanced data, attribute importance can be defined through the boundary region, and an attribute reduction algorithm can be built on that basis. However, this algorithm can only handle unbalanced data in a complete mixed information system and is not suitable for unbalanced data with missing values. To solve these two problems, this thesis improves the existing rough set models on the basis of incomplete information systems and puts forward corresponding attribute reduction algorithms under the new models.

The main contents of this thesis are as follows:

(1) This thesis studies incomplete continuous data and proposes a new incomplete neighborhood decision-theoretic rough set model. First, an incomplete neighborhood relation is introduced for incomplete continuous data. The traditional decision-theoretic rough sets are then reconstructed with this binary relation, yielding a model called incomplete neighborhood decision-theoretic rough sets. Based on the principle of decision cost, an attribute reduction algorithm that minimizes the decision cost is proposed. Finally, experiments show that the proposed algorithm achieves better attribute reduction performance.

(2) This thesis also studies unbalanced data in incomplete hybrid information systems and proposes an attribute reduction method for such data. First, the traditional rough set model is generalized, and a rough set model suitable for incomplete hybrid information systems is proposed. Then, for unbalanced data, a new attribute importance measure is defined according to the heterogeneity of the upper and lower boundary regions and the class distribution. Finally, an attribute reduction algorithm for unbalanced data is designed based on the discernibility matrix. The experimental results show that the algorithm is effective and superior for attribute reduction of incomplete and unbalanced data.

The innovations of this thesis are summarized as follows:

(1) An incomplete neighborhood decision-theoretic rough set model is proposed, and an attribute reduction method for the model is defined based on the decision cost principle. First, the decision cost of the three actions available for an object is obtained according to the Bayesian decision rule. Then the total decision cost over the whole decision class is defined under a given attribute set. Finally, a forward (additive) heuristic search strategy is adopted to reduce the attributes, using minimum cost as the evaluation criterion.

(2) Since current information systems contain unbalanced data, a rough set model that can handle unbalanced data in incomplete information systems is proposed, following the method used for unbalanced data in complete information systems. In this model, a new attribute importance measure is defined by combining the neighborhood tolerance class, the boundary region, and the non-uniform class distribution, and the attribute reduct is computed based on the discernibility matrix.
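The cost-driven reduction described in item (1) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm, whose exact definitions are not given in this abstract. The sketch assumes that a missing value is marked with `None` and treated as compatible with any value, that two objects are neighbors when their known values differ by at most a radius `delta` on every selected attribute, that the loss function is a dictionary with the hypothetical keys `PP`, `BP`, `NP`, `PN`, `BN`, `NN` (accept, defer, reject for objects inside and outside the class), and that the forward search stops once the cost of the selected subset is no higher than the cost of the full attribute set.

```python
MISSING = None  # assumed marker for a missing value


def neighbors(data, i, attrs, delta):
    """Indices of objects within delta of object i on every attribute in attrs.

    Assumption: a missing value on either side is compatible with anything.
    """
    result = []
    for j, row in enumerate(data):
        compatible = True
        for a in attrs:
            u, v = data[i][a], row[a]
            if u is MISSING or v is MISSING:
                continue
            if abs(u - v) > delta:
                compatible = False
                break
        if compatible:
            result.append(j)
    return result


def total_cost(data, labels, attrs, delta, loss):
    """Sum, over classes and objects, of the cheapest of accept / defer / reject."""
    cost = 0.0
    for cls in sorted(set(labels)):
        for i in range(len(data)):
            nb = neighbors(data, i, attrs, delta)  # always contains i itself
            p = sum(1 for j in nb if labels[j] == cls) / len(nb)
            cost += min(
                loss["PP"] * p + loss["PN"] * (1 - p),  # accept
                loss["BP"] * p + loss["BN"] * (1 - p),  # defer
                loss["NP"] * p + loss["NN"] * (1 - p),  # reject
            )
    return cost


def greedy_reduct(data, labels, delta, loss, eps=1e-12):
    """Forward search: repeatedly add the attribute that gives the lowest cost."""
    remaining = list(range(len(data[0])))
    target = total_cost(data, labels, remaining, delta, loss)
    selected = []
    # Assumed stopping rule: stop when the subset is as cheap as the full set.
    while remaining and total_cost(data, labels, selected, delta, loss) > target + eps:
        best = min(
            remaining,
            key=lambda a: total_cost(data, labels, selected + [a], delta, loss),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `greedy_reduct(data, labels, delta, loss)` with `data` as a list of rows of floats or `None` returns the indices of the selected attributes in the order they were added. Summing the per-object minimum cost over all decision classes is one plausible reading of "total decision cost"; a different aggregation would change only `total_cost`.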
Keywords: Rough set, Incomplete neighborhood relation, Minimum decision cost, Incomplete hybrid information system, Unbalanced data, Attribute reduction