
Imputation Methods Of Missing Values Based On Compositional Data

Posted on: 2018-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Cheng
Full Text: PDF
GTID: 2310330521951762
Subject: Statistics
Abstract/Summary:
In the era of big data, missing data is a common problem in experimental and survey research. Data can go missing for many reasons, such as non-response in a survey or mistakes made during data collection. Missing data lowers the quality of the resulting statistics, complicates the analysis, can produce unreliable results, and reduces the efficiency of the whole statistical procedure. Compositional data is a kind of complex multivariate data that describes the parts of a whole and is widely used in fields such as geology, social science, and economics. Because of the special properties of compositional data, traditional imputation methods may give undesirable results when applied to it directly. This thesis therefore proposes two imputation methods for missing values in compositional data, one based on modified kernel functions and one based on Random Forest, and verifies the accuracy of both through examples and simulation experiments.
The thesis is divided into five chapters:
Chapter 1: introduces the background of the topic and the current state of methods for handling missing data.
Chapter 2: gives the definition of compositional data and its operations, and reviews existing methods for handling missing values in compositional data.
Chapter 3: because estimating parameters in the simplex space is very difficult, proposes a new imputation method for compositional data based on modified kernel functions, and verifies its accuracy through examples and simulation experiments.
Chapter 4: proposes a new imputation method for compositional data based on Random Forest, and verifies its accuracy through examples and simulation experiments.
Chapter 5: summarizes the research work and results of the thesis, points out its shortcomings, and identifies the problems that remain to be solved.
Keywords/Search Tags: Compositional data, Missing data, Gauss kernel function, Sigmoid kernel function, Random Forest
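
The abstract refers to the operations on compositional data without listing them. As a rough illustration only, the sketch below shows the standard textbook operations on the simplex (closure, perturbation, powering, and the centered log-ratio transform). It is not taken from the thesis and does not reproduce its kernel-based or Random Forest imputation methods; all function names and the sample values are assumptions chosen for this example.

```python
import math

def closure(parts, total=1.0):
    """Rescale strictly positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]

def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, alpha):
    """Powering: component-wise power followed by closure."""
    return closure([a ** alpha for a in x])

def clr(parts):
    """Centered log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def clr_inverse(coords, total=1.0):
    """Map clr coordinates back to a composition on the simplex."""
    return closure([math.exp(c) for c in coords], total)

# Hypothetical three-part composition (illustrative values only).
x = closure([1.0, 2.0, 7.0])
coords = clr(x)              # unconstrained coordinates that sum to zero
back = clr_inverse(coords)   # recovers x up to rounding error
print(x, coords, back)
```

One reason such transforms matter for imputation is that a method working in log-ratio coordinates can use ordinary real-valued tools and then map the result back, so the imputed composition still has positive parts with a constant sum. Whether the thesis takes this route is not stated in the abstract.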