
Research And Application Of Variable Selection In Bayesian Latent Class Analysis

Posted on: 2020-07-31; Degree: Master; Type: Thesis
Country: China; Candidate: J F Chen; Full Text: PDF
GTID: 2417330575487544; Subject: Master of Applied Statistics
Abstract/Summary:
With the arrival of the big data era, data volumes are growing explosively. As a valuable intangible resource, data is receiving more and more attention across all industries. However, not all data is valuable. When solving practical problems, the data of interest may be buried under a huge amount of irrelevant data, which can lead to wrong conclusions. Before building a model, we therefore need to identify the data that matters, and one of the main ways to do so is variable selection. Latent class analysis is a statistical technique for exploring the latent variables behind statistically correlated manifest variables, and it is widely used in social statistics. The latent class model deals with categorical variables, which gives it unique advantages in analyzing and processing questionnaire data. Variables need to be selected before a latent class model is built. The traditional variable selection method is best subset selection, which is effective when the data dimension is small but is unsuitable for high-dimensional cases because of its heavy computational cost. Most of the more advanced variable selection methods rely on penalized likelihood functions, but these methods are not suitable for the categorical data in latent class analysis. In this paper, we therefore propose an algorithm that uses the Chi-square test to calculate a significance rate for each variable and divides the original variables into blocks in order to carry out variable selection in latent class analysis. Validation on simulated data sets shows that the proposed method can effectively solve the variable selection problem in latent class analysis. A Bayesian latent class model is then used to analyze questionnaire data on drug addicts in Yunnan. In this empirical analysis, the Chi-square-based variable selection method proposed in this paper is used and performs well. The results show that: first, the variable selection method based on the Chi-square test is effective in latent class analysis and greatly simplifies the construction of the latent class model; second, latent class analysis is suitable for classifying drug addicts in Yunnan, who can be divided into two categories according to the psychological survey results: a general addiction group and a severe addiction group. Once the classification results are obtained, the relevant groups can be treated in a more targeted way.
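The abstract does not spell out how the significance rate is computed, so the following Python sketch is only one plausible reading, not the thesis's algorithm. It assumes that a variable's "significance rate" is the share of pairwise Chi-square independence tests against the other variables that reject independence at level `alpha`. The function names (`chi2_sf`, `chi2_independence_pvalue`, `significance_rates`), the default `alpha`, and the simulated item probabilities are all hypothetical, and the block-division step is left out because the abstract gives no detail on it.

```python
import math
import random
from collections import Counter
from itertools import combinations


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (closed form, integer df)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc term plus a finite series in odd powers of sqrt(x)
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(chi / math.sqrt(2))
    return tail + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total


def chi2_independence_pvalue(a, b):
    """Pearson chi-square test of independence for two categorical columns."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    df = (len(count_a) - 1) * (len(count_b) - 1)
    if df == 0:
        return 1.0  # a constant column carries no association
    stat = 0.0
    for u, nu in count_a.items():
        for v, nv in count_b.items():
            expected = nu * nv / n
            stat += (joint.get((u, v), 0) - expected) ** 2 / expected
    return chi2_sf(stat, df)


def significance_rates(columns, alpha=0.05):
    """ASSUMED definition: fraction of the other variables each one is dependent on."""
    p = len(columns)
    hits = [0] * p
    for i, j in combinations(range(p), 2):
        if chi2_independence_pvalue(columns[i], columns[j]) < alpha:
            hits[i] += 1
            hits[j] += 1
    return [h / (p - 1) for h in hits]


if __name__ == "__main__":
    # Hypothetical simulation: two latent classes, four informative binary
    # items (answer probability depends on the class) and two pure-noise items.
    rng = random.Random(0)
    classes = [rng.randrange(2) for _ in range(500)]
    informative = [[int(rng.random() < (0.9 if c == 0 else 0.1)) for c in classes]
                   for _ in range(4)]
    noise = [[int(rng.random() < 0.5) for _ in classes] for _ in range(2)]
    print(significance_rates(informative + noise, alpha=0.01))
```

Under this reading, items driven by the latent class are mutually dependent and should receive high rates, while noise items should receive rates near zero, so a threshold on the rate acts as the selection rule. In a high-dimensional setting the same function could be applied within each block of variables rather than across all pairs, which is where the block division mentioned in the abstract would reduce the number of tests.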
Keywords/Search Tags:Variable selection, Chi-square test, Bayesian latent class analysis, Drug treatment