Research On Nonparametric Methods Of The Multi-index Comprehensive Evaluation And Clustering Methods With Missing Data

Posted on: 2012-09-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R J Luo
GTID: 1220330395964408
Subject: Plant biotechnology
Abstract/Summary:
Multi-index comprehensive evaluation is the overall assessment of an observed system that is described by multiple attributes. It uses mathematical and statistical methods to convert a number of statistical indices, each reflecting a different attribute of the evaluated object, into dimensionless relative evaluation values, and uses these values to rank the objects by merit. Comprehensive evaluation methods have long been a hot topic in evaluation research. The dissertation gives a brief review of multi-index comprehensive evaluation methods, focusing on several commonly used methods from operations research and other branches of mathematics, including the analytic hierarchy process, fuzzy comprehensive evaluation, data envelopment analysis, grey comprehensive evaluation and TOPSIS, among others. It interprets each method in detail in terms of its definition and principle, pattern and procedure, and advantages and disadvantages. Finally, it discusses the integration of evaluation methods, existing problems and research trends.
Cluster analysis is a multivariate statistical method for classification problems and an important means of data analysis. It divides an unlabeled data set into several subsets (classes) according to some measure of similarity, so that objects within a class are as similar as possible and objects in different classes are as dissimilar as possible. Cluster analysis can effectively reveal the distribution features and typical patterns hidden in a data set, which lays a good foundation for making full and effective use of the data. It has become an important technique and a main method of data mining.
Over the years, scholars have studied clustering algorithms broadly and in depth. The dissertation surveys five categories of clustering algorithms (partitioning, hierarchical, density-based, grid-based and model-based methods) and introduces some classical algorithms in each category.
On this basis, the dissertation makes a preliminary study of two issues: testing the significance of differences in multi-index comprehensive evaluation results, and the statistical analysis of clustering with missing data. The major research results are:
(1) A statistical hypothesis test for multi-trait comprehensive evaluation (non-parametric rank-sum and rank-sum-difference tests)
Many multi-trait comprehensive evaluation methods exist at home and abroad, but they only offer different ways of discriminating between objects: their results give a comprehensive evaluation value and the corresponding merit order, but not the significance of the difference between each evaluation object and the average level. The dissertation presents a statistical hypothesis test for multi-trait comprehensive evaluation (the non-parametric rank-sum test). Under the null hypothesis that a variety's ranking on each trait is random, the theoretical distribution of the sum of ranks (SR) is first derived and then used to obtain the critical values of the rank-sum test for multi-trait comprehensive evaluation. A new C++ class and its basic arithmetic were defined to deal with the miscounting caused by the limited precision of built-in data types in common statistical software when the numbers of varieties and traits are large.
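As a rough illustration of how such a null distribution can be counted exactly, the Python sketch below assumes that under the null hypothesis a variety's rank on each of m traits is an independent, uniform draw from 1..n, with no ties. The number of ways to reach a given rank sum is then a coefficient of (x + x^2 + ... + x^n)^m. The function names, the two-sided split of alpha and the choice of Python are assumptions made for this example and are not taken from the dissertation, which used its own C++ class. Python integers are arbitrary-precision, so the counts do not overflow.

```python
from fractions import Fraction


def rank_sum_counts(n_varieties, n_traits):
    """Return {s: number of rank combinations whose sum is s}.

    Assumption for this sketch: each trait contributes one rank drawn
    independently and uniformly from 1..n_varieties.
    """
    counts = {0: 1}
    for _ in range(n_traits):
        nxt = {}
        for s, c in counts.items():
            for r in range(1, n_varieties + 1):
                nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt
    return counts


def critical_values(n_varieties, n_traits, alpha=Fraction(1, 20)):
    """Two-sided critical rank sums (lower, upper) at level alpha.

    lower is the largest s with P(SR <= s) <= alpha / 2,
    upper is the smallest s with P(SR >= s) <= alpha / 2.
    Either value is None when no rank sum is extreme enough.
    """
    counts = rank_sum_counts(n_varieties, n_traits)
    total = n_varieties ** n_traits  # exact, arbitrary-precision integer
    tail = alpha / 2
    sums = sorted(counts)

    lower, cum = None, 0
    for s in sums:
        cum += counts[s]
        if Fraction(cum, total) > tail:
            break
        lower = s

    upper, cum = None, 0
    for s in reversed(sums):
        cum += counts[s]
        if Fraction(cum, total) > tail:
            break
        upper = s
    return lower, upper
```

Under these assumptions, `critical_values(12, 5)` corresponds to the 12 varieties and 5 traits of the maize example; because the distribution is symmetric, the two returned values are mirror images about 5 × 13 / 2 = 32.5. The critical values reported in the dissertation may differ if its null model or tail convention differs.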
Finally, the theoretical results are applied to five starch viscosity traits of 12 glutinous maize varieties.
The rank-sum test above can test the significance of the difference between an evaluation object and the average level, but it cannot test the significance of the difference between two evaluation objects. Based on the theoretical distribution of the rank sum, the dissertation derives the theoretical distribution of the multi-trait rank-sum difference by combinatorial methods and from it obtains the critical values for the rank-sum-difference test in multi-trait comprehensive evaluation. The rank-sum-difference test is then used to test the significance of the difference between two evaluation objects, which enables multiple comparisons among the evaluation objects.
(2) A model-based dynamic clustering method for missing data
Cluster analysis, a multivariate statistical method, assigns a set of observations to subsets (called clusters) so that observations in the same cluster are similar in some sense. Current clustering techniques generally depend on complete data. In practice, however, data sets often contain missing values, which makes cluster analysis difficult. The dissertation studies a model-based dynamic clustering method for missing data. It estimates the missing values from the auxiliary information in related variables, substitutes reasonable replacement values, and so constructs a "complete" data set. On this basis, EM iterations make the parameter estimates and the replacement values for the missing data gradually converge; each individual is then classified by its Bayesian posterior probability, which realizes the dynamic clustering. Simulation studies show that the replacement method for missing values converges well and can cluster data with missing values accurately.
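The general idea of EM clustering with missing values can be sketched as follows. This is a deliberately simplified example and not the dissertation's algorithm: it assumes a Gaussian mixture with diagonal covariances, so a missing entry is replaced by the cluster mean of that variable rather than predicted from correlated variables as the abstract describes. The function name `em_cluster_missing`, the use of `None` to mark missing entries and the caller-supplied starting means are all assumptions made for illustration.

```python
import math


def em_cluster_missing(rows, init_means, n_iter=50, var_floor=1e-6):
    """Cluster rows (lists of floats, None = missing) by EM.

    Returns (labels, imputed_rows, means). Assumes every variable is
    observed in at least one row and values are missing at random.
    """
    n, d, k = len(rows), len(rows[0]), len(init_means)
    means = [list(m) for m in init_means]
    # Start every component from the overall variance of the observed values.
    start_var = []
    for j in range(d):
        obs = [r[j] for r in rows if r[j] is not None]
        centre = sum(obs) / len(obs)
        start_var.append(max(sum((v - centre) ** 2 for v in obs) / len(obs), var_floor))
    variances = [list(start_var) for _ in range(k)]
    weights = [1.0 / k] * k

    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster, using only the
        # observed coordinates of each row.
        resp = []
        for row in rows:
            logp = []
            for c in range(k):
                lp = math.log(weights[c])
                for j, v in enumerate(row):
                    if v is not None:
                        lp -= 0.5 * (math.log(2 * math.pi * variances[c][j])
                                     + (v - means[c][j]) ** 2 / variances[c][j])
                logp.append(lp)
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            z = sum(unnorm)
            resp.append([u / z for u in unnorm])

        # M-step: a missing entry contributes its expected value (the old
        # cluster mean) and its expected squared deviation (the old variance).
        for c in range(k):
            n_c = max(sum(r[c] for r in resp), 1e-12)
            for j in range(d):
                old_mean, old_var = means[c][j], variances[c][j]
                filled = [old_mean if row[j] is None else row[j] for row in rows]
                new_mean = sum(r[c] * x for r, x in zip(resp, filled)) / n_c
                spread = sum(
                    r[c] * ((x - new_mean) ** 2 + (old_var if row[j] is None else 0.0))
                    for r, x, row in zip(resp, filled, rows))
                means[c][j] = new_mean
                variances[c][j] = max(spread / n_c, var_floor)
            weights[c] = n_c / n

    # Classify by the largest posterior; fill gaps with the posterior-weighted mean.
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    imputed = [
        [v if v is not None else sum(r[c] * means[c][j] for c in range(k))
         for j, v in enumerate(row)]
        for row, r in zip(rows, resp)
    ]
    return labels, imputed, means
```

A call such as `em_cluster_missing(rows, init_means=[[1.0, 1.0], [9.0, 9.0]])` returns one cluster label per row together with the completed data. In practice the starting means might come from a preliminary clustering of the complete rows, and a full-covariance model would be needed to exploit the correlation between variables when replacing missing values.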
Keywords/Search Tags: Comprehensive evaluation, Rank-Sum Testing, Rank-Sum-Difference Testing, Missing data, Cluster analysis