Research On Several Variable Selection Methods And Their Applications In Longitudinal Data

Posted on: 2021-03-23    Degree: Master    Type: Thesis
Country: China    Candidate: G X Chen    Full Text: PDF
GTID: 2480306107479864    Subject: Applied Statistics
Abstract/Summary:
Longitudinal data and longitudinal data models are attracting growing attention in statistics. They have a very wide range of applications, especially in medical and sociological research, and have brought great convenience to people's lives. Longitudinal data combine the characteristics of cross-sectional data and time series data: they reflect how subjects change over time as well as the differences between and within subjects, which increases the amount of information available. It is therefore necessary to study longitudinal data models. Generalized estimating equations (GEE) can be used to fit models for response variables that follow binomial, normal (Gaussian), and other distributions. GEE accounts for the correlation among repeated measurements of the response in longitudinal data and yields robust parameter estimates.

This thesis presents several variable selection methods for longitudinal data and their applications in three parts: theory, simulation, and case analysis. The theoretical part first introduces GEE and the working correlation matrix for longitudinal data, explains the advantages of GEE in longitudinal data research, and describes several criteria for variable selection: AIC, BIC, QIC, EAIC, EBIC, GAIC, and GBIC. The simulation then compares the performance of these criteria in variable selection in order to find a better one.

In the simulation part, QIC, EAIC, EBIC, GAIC, and GBIC are extended to longitudinal data analysis, with GEE as the framework for selecting covariates. Correlated response variables (Poisson, Gaussian, and binomial) are generated with an R package. Three correlation structures are considered: independence (ind), exchangeable (exc), and AR(1), and the performance of the criteria in selecting covariates is compared under each. The results show that GAIC and GBIC outperform the other variable selection criteria. EAIC and EBIC are effective only when the working correlation structure is correctly specified, whereas the performance of GAIC and GBIC does not depend on whether it is correctly specified.

GAIC and GBIC are then applied in a case analysis that explores the factors affecting personal insurance premium income. Personal insurance premium income is the response variable, and the four covariates are the urbanization rate, the dependency ratio, GDP, and the savings deposits of urban and rural residents. Data for 2012 to 2018 are collected for regions with distinct regional characteristics: Heilongjiang, Jilin, and Liaoning Provinces (northern China); Beijing, Tianjin, Chongqing, and Shanghai (the four municipalities directly under the central government); Fujian and Guangdong Provinces (southern China); Shaanxi and Hubei Provinces (central China); and the Tibet, Inner Mongolia, and Ningxia Hui Autonomous Regions (autonomous regions). The final conclusion is that the savings deposits of urban and rural residents, GDP, and the urbanization rate together have a significant effect on life insurance premium income, whereas the effect of the dependency ratio is not significant.
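The three correlation structures named in the abstract (ind, exc, AR(1)) have standard textbook forms, which the short sketch below builds for illustration. It is not taken from the thesis, whose simulations were run in R. The function name `working_correlation`, the labels `"ind"`, `"exc"`, `"ar1"`, the parameter `alpha`, and the example size of four time points are all assumptions made here.

```python
def working_correlation(structure, n_times, alpha=0.0):
    """Build an n_times x n_times working correlation matrix (nested lists).

    structure: "ind" (independence), "exc" (exchangeable) or "ar1" (AR(1)).
    alpha: assumed correlation parameter; ignored for "ind".
    """
    if structure not in ("ind", "exc", "ar1"):
        raise ValueError("structure must be 'ind', 'exc' or 'ar1'")
    matrix = []
    for j in range(n_times):
        row = []
        for k in range(n_times):
            if j == k:
                row.append(1.0)                    # unit diagonal in every structure
            elif structure == "ind":
                row.append(0.0)                    # no within-subject correlation
            elif structure == "exc":
                row.append(alpha)                  # same correlation for every pair
            else:
                row.append(alpha ** abs(j - k))    # correlation decays with time lag
        matrix.append(row)
    return matrix


# Illustrative use: four repeated measurements per subject, alpha = 0.5.
for name in ("ind", "exc", "ar1"):
    print(name)
    for row in working_correlation(name, 4, alpha=0.5):
        print("   ", row)
```

In a GEE fit, one of these matrices is chosen as the working correlation. The abstract's finding is that EAIC and EBIC are sensitive to whether this choice matches the true structure, whereas GAIC and GBIC are not.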
Keywords/Search Tags:longitudinal data, generalized estimation equation, variable selection, life insurance premium income