
Research On Cost-sensitive Model For Boundary Region In Three-way Decisions Model

Posted on: 2018-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: G Wang
Full Text: PDF
GTID: 2348330515483865
Subject: Computer application technology
Abstract/Summary:
The three-way decisions theory is built on the notions of acceptance, rejection, and non-commitment. It extends two-way decisions with a third option, non-commitment, which lets a decision maker postpone a decision when information is insufficient instead of deciding immediately. The non-commitment decision is also called a deferred decision. While studying rough sets and decision-theoretic rough sets, Yao proposed the three-way decisions theory, which provides a reasonable semantic interpretation of the three regions of the rough set model: the positive, negative, and boundary regions stand for acceptance, rejection, and non-commitment, respectively. The three-way decisions model describes well how people reason about practical decision-making problems. Today the theory is widely applied in fields such as medical diagnosis, investment decision-making, and spam classification.

The Decision-Theoretic Rough Set (DTRS) is cost-sensitive when dealing with classification problems: it derives the thresholds α and β directly from loss functions. However, DTRS does not process the boundary region any further. The Three-way Decisions Model based on the Constructive Covering Algorithm (CCTDM) introduced the Constructive Covering Algorithm (CCA) into three-way decisions theory and opened a new direction for it. CCTDM generates the three regions automatically, without any parameters. It also provides three methods for dealing with the boundary region, but none of them is cost-sensitive. In recent years, with the development of data mining and machine learning, it has become increasingly clear that classification problems are often cost-sensitive, and how to process the boundary region has become a pressing issue in three-way decisions theory. This dissertation therefore proposes two cost-sensitive models for processing the boundary region. The goal is to reduce, as far as possible, both the classification loss and the number of misclassified high-cost samples when processing the boundary region. The main work of this dissertation is as follows:

1. We briefly describe the development of the three-way decisions theory and analyze its existing problems. We then review in detail the theory behind the three-way decisions model based on DTRS and the three-way decisions model based on CCA. Finally, we put forward two cost-sensitive models that deal with the boundary region effectively: the Cost-sensitive three-way decisions Model for Processing Boundary region and the Cost-sensitive Three-way decisions model to process boundary region based on K Nearest neighbor.

2. The Cost-sensitive three-way decisions Model for Processing Boundary region (CPBM) reduces classification loss by adjusting the boundary distance between boundary-region samples and covers. By contrast, in the three-way decisions model based on CCA, the Nearest to the Boundary Principle assigns a sample solely according to the minimum boundary distance between the sample and the covers, and is therefore not cost-sensitive. Compared with the Nearest to the Boundary Principle, CPBM improves the recall of high-cost samples by up to twenty percent, which reduces the classification loss.

3. The Cost-sensitive Three-way decisions model to process boundary region based on K Nearest neighbor (CTK) combines the K Nearest Neighbor method with cost-sensitive classification. By quantifying the losses of different decisions, CTK reduces classification loss by choosing the decision with the minimum loss. With the optimal value of K, CTK improves classification accuracy by making full use of the class information of the K nearest covers. Compared with methods that are not cost-sensitive, CTK therefore not only reduces the classification loss effectively but also reduces the classification error rate on some data sets.
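
The two cost-sensitive ideas mentioned in the abstract can be illustrated with a small sketch. The first part computes the DTRS thresholds α and β from a loss function using the standard decision-theoretic rough set formulas and applies the resulting three-way rule. The second part is only a generic minimum-expected-loss vote over k neighbor labels; it is not the thesis's CTK model, whose details (for example, how the K nearest covers are weighted) are not given in the abstract. All names (`Loss`, `thresholds`, `three_way`, `min_loss_label`) and the numeric loss values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loss:
    """Losses for each action when the object is in X (…p) or not in X (…n)."""
    pp: float   # accept, object in X
    bp: float   # defer,  object in X
    np_: float  # reject, object in X
    pn: float   # accept, object not in X
    bn: float   # defer,  object not in X
    nn: float   # reject, object not in X


def thresholds(l: Loss) -> tuple[float, float]:
    """Standard DTRS thresholds (alpha, beta) derived from the loss function."""
    alpha = (l.pn - l.bn) / ((l.pn - l.bn) + (l.bp - l.pp))
    beta = (l.bn - l.nn) / ((l.bn - l.nn) + (l.np_ - l.bp))
    return alpha, beta


def three_way(p: float, alpha: float, beta: float) -> str:
    """Three-way rule on the conditional probability p = Pr(X | [x])."""
    if p >= alpha:
        return "accept"   # positive region
    if p <= beta:
        return "reject"   # negative region
    return "defer"        # boundary region


def min_loss_label(neighbor_labels: list[str],
                   cost: dict[str, dict[str, float]]) -> str:
    """Pick the label with the lowest average cost over the neighbors.

    cost[true_label][predicted_label] is the misclassification cost.
    This is a generic cost-sensitive vote, assumed here for illustration.
    """
    k = len(neighbor_labels)

    def expected(pred: str) -> float:
        return sum(cost[true][pred] for true in neighbor_labels) / k

    return min(sorted(cost), key=expected)


# Hypothetical loss values chosen so that alpha > beta.
loss = Loss(pp=0, bp=1, np_=4, pn=4, bn=1, nn=0)
alpha, beta = thresholds(loss)

# Hypothetical costs: missing a "pos" sample is five times worse
# than a false alarm on a "neg" sample.
cost = {"pos": {"pos": 0, "neg": 5}, "neg": {"pos": 1, "neg": 0}}
label = min_loss_label(["neg", "neg", "pos"], cost)
```

With these assumed costs, the vote returns the high-cost class even though it is in the minority among the neighbors, which is the kind of behavior a cost-sensitive treatment of the boundary region aims for.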
Keywords/Search Tags:Three-way decisions, Process Boundary Region, Constructive Covering Algorithm, Cost-sensitive Classification, K Nearest Neighbor Algorithm