A Study On Qualitative Data Analysis Based On Rough Sets

Posted on: 2009-12-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X W Fan
Full Text: PDF
GTID: 1100360272988802
Subject: Statistics
Abstract/Summary:
Qualitative data analysis methods have broad prospects for application, but the theory needs further development to address the following problem: with progress in data acquisition and data storage, databases grow ever larger, and "population drift" and dependent, identically distributed data appear. As a mathematical tool for processing discrete data, rough sets provide a new point of view for qualitative data analysis. Through an intensive study of the content of qualitative data analysis and of rough set methods, from both theoretical and applied points of view, this thesis establishes methods of qualitative data analysis based on rough sets, and mainly discusses the applications of rough sets to data description, data pre-processing, discrimination, and cluster analysis.

The main contributions are the following:

1. Qualitative data analysis methods are studied by applying rough sets, from the point of view of inductive deduction based on the data.
2. The thesis proposes the concept of an information table with divided classes, S = {U/R, A, V, F}, to describe qualitative data, so that the analysis of relations between variables can be transformed into the analysis of relations between equivalence classes. It further proposes describing association relations by means of the association information coefficient IR(X_i, X_j), which overcomes the shortcomings of the χ² test.
3. The thesis proposes a variable reduction method based on information entropy, and a method to compress the association structure of the data.
4. When introducing flow graphs, the thesis suggests that the layers of a flow graph should be determined according to the importance of the variables, so as to avoid problems such as Simpson's paradox, which are caused by improperly determined layers.
5. The thesis suggests a method for selecting discriminative variables according to the amount of information they carry, and divides discriminations of qualitative data into three classes: completely deterministic discrimination, completely non-deterministic discrimination, and rough discrimination.
6. Drawing on factor analysis, the thesis puts forward a clustering method that determines the optimal family of equivalence classes, based on an analysis of the association structure of the variables after dimension reduction.

The study shows that the proposed methods can be applied to traditional data analysis as well as to large-scale data analysis, removing some limitations of traditional methods and contributing to the development of qualitative data analysis.
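The abstract does not give the thesis's formal definitions, so the following is only a minimal sketch using the standard rough set notions (equivalence classes induced by a set of attributes, and lower and upper approximations of a target set). The toy table, the function names, and in particular the rule used to label the three kinds of discrimination are assumptions made for illustration, not the author's method.

```python
from collections import defaultdict

# Hypothetical toy information table: object id -> attribute values.
table = {
    1: {"color": "red", "size": "big", "cls": "yes"},
    2: {"color": "red", "size": "big", "cls": "no"},
    3: {"color": "blue", "size": "big", "cls": "yes"},
    4: {"color": "blue", "size": "small", "cls": "no"},
}


def partition(table, attrs):
    """Return the equivalence classes U/R: objects are equivalent
    when they agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


def approximations(blocks, target):
    """Standard lower and upper approximations of target."""
    lower, upper = set(), set()
    for block in blocks:
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


def discrimination_type(table, condition_attrs, decision_attr):
    """Assumed reading of the three classes: a block is 'pure' when
    all of its objects share one decision value."""
    blocks = partition(table, condition_attrs)
    pure = [len({table[o][decision_attr] for o in b}) == 1 for b in blocks]
    if all(pure):
        return "completely deterministic"
    if not any(pure):
        return "completely non-deterministic"
    return "rough"


blocks = partition(table, ["color", "size"])
yes_objects = {o for o, row in table.items() if row["cls"] == "yes"}
lower, upper = approximations(blocks, yes_objects)
print(lower, upper)
print(discrimination_type(table, ["color", "size"], "cls"))
```

In this toy table, objects 1 and 2 are indiscernible on `color` and `size` but carry different decisions, so the "yes" class can only be approximated: the lower and upper approximations differ, which is the situation the sketch labels as rough.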
Keywords/Search Tags: Qualitative Data Analysis, Rough Sets, Flow Graphs