
Research On Copula Modeling Method And Statistical Inference Based On Vine

Posted on: 2020-07-31; Degree: Master; Type: Thesis
Country: China; Candidate: C T Liu; Full Text: PDF
GTID: 2370330599455872; Subject: Probability theory and mathematical statistics
Abstract/Summary:
High-dimensional data raise a wide range of problems in practical applications. Such data have complex structures, diverse indicator types, latent influencing factors, and large volumes of information, and they are computationally demanding. For a long time, research on high-dimensional data has largely been limited to relatively simple linear regression and logistic models, and there has been little work on high-dimensional Copula modeling, which has certain advantages in handling many variables. Copula modeling is an important part of statistical theory and is particularly well suited to analyzing correlation among multiple variables with a complex structure. This thesis studies two aspects of high-dimensional data, variable selection and model selection, using different types of high-dimensional Copula methods for modeling and analysis. The work can be roughly divided into two parts.

The first part focuses on model selection in Copula modeling of high-dimensional data, with the model constructed by an R-vine-based method. This mainly involves selecting the nodes of each tree and selecting the pair-Copula functions, and a greedy algorithm is established for both problems. The basic idea is to first determine the nodes of each tree using a modified Akaike information criterion, then compare how well different Copulas fit, and finally obtain the overall model structure on the minimum spanning tree principle. The algorithm has relatively low computational complexity, is easy to operate, and is highly flexible in application. The method was then applied to explore the correlation of microbial communities in 11 different parts of the human body, which illustrated its effectiveness and applicability. The results show that body parts that are closer together tend to be more strongly correlated, although two parts that seem far apart are sometimes still highly correlated.

The second part studies variable selection based on the D-vine quantile regression method. The main setting is a response variable that is affected by several indicator variables at the same time; the method examines the influence of the different indicators on the response at a given risk level, and the model is applied to IVF data. The results show that, to obtain the ideal pregnancy outcome, body mass index should be closely monitored throughout the pregnancy. For high-risk pregnant women, it is especially important to pay attention to three indicators that have a greater impact on pregnancy outcomes: age, number of eggs retrieved, and high-quality embryo rate.

The applications of the two methods suggest that the model selection and variable selection procedures established in this thesis meet the needs of most high-dimensional data problems in practical research. High-dimensional Copula modeling methods, such as the quantile-based Copula method, have clear advantages for high-dimensional data and lead to approaches better suited to applied research. The work has reference value and guiding significance for related research on model selection and variable selection.
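The greedy selection step described in the first part can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes absolute Kendall's tau as the edge weight, uses the plain AIC rather than the modified criterion (whose form the abstract does not give), estimates parameters by inverting Kendall's tau, and considers only two candidate families (Gaussian and Clayton). All function names are hypothetical.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.stats import kendalltau, norm, rankdata


def pseudo_obs(x):
    """Map each column to (0, 1) using rescaled ranks."""
    n = x.shape[0]
    return np.apply_along_axis(rankdata, 0, x) / (n + 1)


def first_tree(u):
    """Spanning tree over the variables that favours strongly dependent pairs.

    Assumption: edge weight is 1 - |tau|, so a minimum spanning tree keeps
    the pairs with the largest absolute Kendall's tau.
    """
    d = u.shape[1]
    weight = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            tau, _ = kendalltau(u[:, i], u[:, j])
            # Small offset: scipy treats an exact zero as "no edge".
            weight[i, j] = 1.0 - abs(tau) + 1e-9
    mst = minimum_spanning_tree(weight).tocoo()
    return sorted((int(min(i, j)), int(max(i, j)))
                  for i, j in zip(mst.row, mst.col))


def gaussian_loglik(u, v, tau):
    """Gaussian copula log-likelihood, rho obtained from tau by inversion."""
    rho = np.sin(np.pi * tau / 2.0)
    z1, z2 = norm.ppf(u), norm.ppf(v)
    r2 = rho ** 2
    return np.sum(-0.5 * np.log(1 - r2)
                  - (r2 * (z1 ** 2 + z2 ** 2) - 2 * rho * z1 * z2)
                  / (2 * (1 - r2)))


def clayton_loglik(u, v, tau):
    """Clayton copula log-likelihood, theta = 2*tau / (1 - tau), tau > 0."""
    theta = 2.0 * tau / (1.0 - tau)
    s = u ** -theta + v ** -theta - 1.0
    return np.sum(np.log1p(theta)
                  - (1 + theta) * (np.log(u) + np.log(v))
                  - (2 + 1 / theta) * np.log(s))


def select_pair_copula(u, v):
    """Return the family with the smallest AIC (one parameter each)."""
    tau, _ = kendalltau(u, v)
    aic = {"gaussian": 2 - 2 * gaussian_loglik(u, v, tau)}
    if tau > 0:  # Clayton only models positive dependence
        aic["clayton"] = 2 - 2 * clayton_loglik(u, v, tau)
    return min(aic, key=aic.get), aic
```

A full R-vine would repeat both steps on conditional pseudo-observations for each higher tree; that part is omitted here to keep the sketch short.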
Keywords/Search Tags: High-dimensional data, Copula function, Model selection, Variable selection, Quantile regression