Research And Application Of Penalty Likelihood Estimation Based On Two-part Model

Posted on:2021-12-11

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Zhang

Full Text:PDF

GTID:2517306113453464

Subject:Statistics

Abstract/Summary:

In statistics, the latent model structure and variable selection problems of zero-inflated data are often studied with zero-inflated models. In most cases, however, the non-zero part of the response variable is continuous, so a simple zero-inflated model cannot describe the structure of such data, and the corresponding parameter estimation methods no longer apply. For this reason, scholars proposed the two-part model for zero-inflated semi-continuous data. This thesis introduces penalized maximum likelihood estimation into the two-part model to study variable selection. The main contents and conclusions are as follows:

1. The principle of the two-part model under penalized maximum likelihood estimation is set out, and existing penalized estimation methods are studied systematically using the Logit-Gamma two-part model. These methods are applied to family medical expense data to analyze its influencing factors. Numerical simulation and a case analysis show that penalized maximum likelihood estimation based on the minimax concave penalty (MCP) performs better in terms of stability and model interpretability.

2. Building on the Combined Punishment (CP) method proposed by Wang et al., and motivated by the good performance of the L₂ penalty among highly correlated explanatory variables, a new penalized likelihood estimation method, NCPM (New Combined Punishment Method), is proposed; it is efficient, convenient and easy to implement. A theoretical proof shows that, under certain regularity conditions, the method has the "oracle" property for variable selection. Simulation results show that the method outperforms the other methods to a certain extent when [...] or when the explanatory variables are strongly correlated.

3. For variable selection in two-part models under combined-penalty maximum likelihood estimation, the thesis adopts the LLA-CGD (Local Linear Approximation and Coordinate Gradient Descent) algorithm. The algorithm overcomes the nonlinearity of the objective function and makes the computation feasible. Numerical simulation shows that the algorithm is effective and offers a new approach to variable selection in two-part models.

4. Precipitation amounts greater than zero in Taiyuan city are shown to follow a Gamma distribution, and a Logit-Gamma two-part model is constructed. The NCPM method is applied to this model to analyze the factors affecting precipitation in Taiyuan. According to the analysis, whether precipitation occurs is mainly affected by dew point temperature, wind speed, sunshine duration, relative air humidity, and PM2.5 and PM10 concentrations, among others. When precipitation does occur, its amount is more likely to be affected by daily average temperature, wind speed, sunshine duration, relative air humidity, PM2.5 and PM10 concentrations, AQI, and so on. Finally, a comparison with the Elastic net method shows that the NCPM-based model is more concise and more interpretable.
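To make the objects named in the abstract concrete, the following is a minimal Python sketch of a Logit-Gamma two-part log-likelihood combined with the minimax concave penalty. It is an illustration only, not the thesis's implementation. The parametrization is an assumption: a log link for the Gamma mean, one shared shape parameter, an unpenalized intercept in the first column of `x`, and a default MCP concavity `gamma=3.0`. All function names are hypothetical. The NCPM penalty itself is not reproduced, because the abstract does not give its form.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def mcp_penalty(beta, lam, gamma=3.0):
    """Minimax concave penalty (MCP) for a single coefficient."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return 0.5 * gamma * lam * lam  # constant beyond gamma * lam


def mcp_derivative(beta, lam, gamma=3.0):
    """Derivative of MCP in |beta|.

    A local linear approximation (LLA) step can use this value as the
    weight of an L1 penalty on the corresponding coefficient.
    """
    return max(lam - abs(beta) / gamma, 0.0)


def two_part_loglik(y, x, alpha, beta, shape):
    """Log-likelihood of a Logit-Gamma two-part model (assumed form).

    Part 1: P(y > 0) = 1 / (1 + exp(-x . alpha))
    Part 2: y | y > 0 ~ Gamma(shape, scale = mu / shape), mu = exp(x . beta)
    """
    total = 0.0
    for yi, xi in zip(y, x):
        eta = sum(a * v for a, v in zip(alpha, xi))
        if yi == 0:
            total -= softplus(eta)       # log P(y = 0)
            continue
        total -= softplus(-eta)          # log P(y > 0)
        mu = math.exp(sum(b * v for b, v in zip(beta, xi)))
        total += (shape * math.log(shape / mu)
                  + (shape - 1.0) * math.log(yi)
                  - shape * yi / mu
                  - math.lgamma(shape))  # Gamma log-density
    return total


def penalized_objective(y, x, alpha, beta, shape, lam, gamma=3.0):
    """Average negative log-likelihood plus MCP on non-intercept terms."""
    coefs = list(alpha[1:]) + list(beta[1:])  # x[:, 0] assumed intercept
    penalty = sum(mcp_penalty(c, lam, gamma) for c in coefs)
    return -two_part_loglik(y, x, alpha, beta, shape) / len(y) + penalty
```

In this sketch the zero/non-zero indicator and the positive amounts share one design matrix, but each part has its own coefficient vector (`alpha`, `beta`), so a variable can be selected in one part and dropped in the other.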

Keywords/Search Tags:

Combined punishment, Two-part model, LLA-CGD algorithm, Variable selection, Precipitation

Related items

1	Theoretical Research And Empirical Analysis Of Variable Selection In Spatial Autoregressive Model
2	Variable Selection For Generalized Linear Model With Highly Correlated Predictors
3	Research On Improved Linear And Nonlinear Variable Selection Methods
4	Estimation And Variable Selection For Function-on-scalar Linear Regression Model
5	Study On The Parameter Estimation And Robust Variable Selection For Linear Model
6	A Bayesian Variable Selection Method With Spike-and-Slab Strategy
7	Variable Selection Of Complex Data Joint Model Based On Improved Lasso Method
8	Credit Risk Scoring Model Based On Variable Selection
9	Variable Selection For Partially Linear Spatial Autoregressive Models With A Diverging Number Of Parameters
10	Research On Variable Selection In High Dimensional Data