
Measuring Uncertainty Of Rough Sets And Its Application In Text Classification

Posted on: 2018-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: S H Yang
Full Text: PDF
GTID: 2348330569986441
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of science and technology, human society produces large amounts of data every day, and effective methods for handling data at this scale are urgently needed. Rough set theory emerged as a mathematical tool for processing data and acquiring knowledge from it. Because the classical rough set model imposes strict requirements, its use in practical applications is limited. Researchers have therefore proposed extensions such as the probabilistic rough set model, the variable precision rough set model and the decision-theoretic rough set model, which improve the fault tolerance of rough set theory in practice. In recent years, research on these extensions has greatly enriched rough set theory, but several problems still deserve further study: how to measure the uncertainty of the extended rough set models, how that uncertainty changes as the approximation space changes, and, in text categorization, how to obtain an algorithm with better classification accuracy and efficiency.

To address these problems, this thesis carries out the following work. Firstly, the probabilistic rough set model, the variable precision rough set model and the decision-theoretic rough set model are analyzed in detail. An uncertainty measurement formula based on the three regions (positive region, negative region and boundary region) is proposed, and three kinds of incremental information are defined. The change rules of uncertainty under changing approximation spaces are then discussed. Secondly, building on this work on uncertainty measurement, an uncertainty measurement formula for the approximation set of a rough set is proposed. The change rules of the uncertainty of the approximation set as the threshold changes are then discussed, and an example is given to illustrate the validity of the conclusions.

Finally, the approximation set of a rough set is applied to text classification. The KNN classification algorithm has an obvious defect: its classification efficiency drops significantly when the number of texts is large. This thesis proposes a KNN text classification algorithm based on the approximation set of a rough set. The experimental results show that the proposed algorithm is more efficient than traditional KNN at the same classification accuracy. These results further promote the practical application of rough set theory.
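
The three regions mentioned in the abstract can be illustrated with a minimal sketch. It follows the commonly used textbook formulation of a probabilistic rough set, not necessarily the formulas of this thesis: an equivalence class goes to the positive region when the conditional probability of the target set given that class is at least `alpha`, to the negative region when it is at most `beta`, and to the boundary region otherwise. The function name `three_regions`, the threshold names `alpha` and `beta`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def three_regions(universe, key, target, alpha, beta):
    """Split `universe` into (positive, boundary, negative) regions.

    Hypothetical helper: `key` maps an object to its attribute description,
    so objects with equal keys fall into the same equivalence class.
    Assumes 0 <= beta < alpha <= 1.
    """
    # Build the equivalence classes induced by the attribute description.
    classes = defaultdict(set)
    for x in universe:
        classes[key(x)].add(x)

    pos, bnd, neg = set(), set(), set()
    for block in classes.values():
        # Conditional probability P(X | [x]) = |[x] & X| / |[x]|
        p = Fraction(len(block & target), len(block))
        if p >= alpha:
            pos |= block
        elif p <= beta:
            neg |= block
        else:
            bnd |= block
    return pos, bnd, neg


# Toy universe with equivalence classes {1,2,3}, {4,5,6}, {7,8,9}, {10}.
universe = set(range(1, 11))
target = {1, 2, 3, 4, 7, 8}


def by_block(x):
    return (x - 1) // 3


wide = three_regions(universe, by_block, target, Fraction(3, 4), Fraction(1, 4))
narrow = three_regions(universe, by_block, target, Fraction(3, 5), Fraction(2, 5))
print(wide)
print(narrow)
```

With the wider threshold pair, the classes {4,5,6} and {7,8,9} stay in the boundary region; with the narrower pair they move to the negative and positive regions respectively. This is the kind of threshold-dependent change in the regions that the abstract's discussion of changing thresholds refers to.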
Keywords/Search Tags:rough set, uncertainty, text classification, approximation set, granular computing