Clustering Of Longitudinal Data Based On Mixture Factor Models

Posted on:2022-01-23

Degree:Master

Type:Thesis

Country:China

Candidate:X B Yang

GTID:2506306491460294

Subject:Statistics

Abstract/Summary:

Longitudinal data arise in many scientific fields, and numerous methods have been developed to analyze them. Such data are generated by repeatedly measuring a response variable on each subject at several time points, and they describe how individual responses change over time. For heterogeneous longitudinal data, cluster analysis is an effective tool for characterizing differences between individuals. This thesis first introduces the use of the Gaussian mixture model for clustering longitudinal data without covariates when there are few repeated measurement time points. As the number of time points grows, so does the dimensionality of the data. Because measurements on the same individual at different time points are correlated, the dimension of the covariance matrix grows as well and the number of parameters increases sharply, which poses a great challenge for cluster analysis. For this reason, we consider a mixture of factor analyzers model for clustering. In addition, covariates are important in cluster analysis because they can describe the subgroup means. This thesis therefore proposes the Mixture of Factor analyzers Linear Model with Common Factor Loadings (MCFLM), which combines a mixture of factor analyzers model with a common loading matrix and a multivariate linear model. Under this framework, the mixture of factor analyzers model reduces the high-dimensional repeated measurements to low-dimensional latent factors, and the multivariate linear model describes the relationship between the factors and the covariates within each subgroup. The modified Cholesky decomposition is applied to ensure that the covariance matrix is positive definite, and the EM algorithm is used to estimate the parameters. Finally, the Bayesian information criterion is used to select the most appropriate model. To demonstrate the effectiveness of the method, a numerical simulation study is carried out, and a set of yeast cell gene expression data is used to verify the feasibility of the proposed method.
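
The role of the modified Cholesky decomposition mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the thesis's exact parametrization, so the code below assumes one common form, Sigma = L D L^T, where L is unit lower triangular and D is diagonal with positive entries. The function names `build_covariance` and `is_positive_definite` and the parameter names `phi` and `log_d` are hypothetical and chosen only for illustration. The point is that any real values of the free parameters yield a positive definite covariance matrix, so an optimizer such as the M-step of EM could update them without constraints.

```python
import math
import random


def build_covariance(phi, log_d):
    """Return Sigma = L D L^T built from unconstrained parameters.

    Assumed parametrization (not taken from the thesis):
    phi   : p*(p-1)/2 free reals filling L below its unit diagonal, row by row
    log_d : p free reals; D = diag(exp(log_d)) is positive by construction
    """
    p = len(log_d)
    # Unit lower-triangular factor L.
    L = [[0.0] * p for _ in range(p)]
    k = 0
    for i in range(p):
        L[i][i] = 1.0
        for j in range(i):
            L[i][j] = phi[k]
            k += 1
    d = [math.exp(v) for v in log_d]
    # Sigma[i][j] = sum over m of L[i][m] * d[m] * L[j][m]
    return [
        [sum(L[i][m] * d[m] * L[j][m] for m in range(p)) for j in range(p)]
        for i in range(p)
    ]


def is_positive_definite(sigma):
    """Attempt a standard Cholesky factorization of a symmetric matrix.

    The factorization completes only when every pivot is strictly positive,
    which holds exactly when the matrix is positive definite.
    """
    p = len(sigma)
    C = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sigma[i][j] - sum(C[i][m] * C[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                C[i][j] = math.sqrt(s)
            else:
                C[i][j] = s / C[j][j]
    return True


# Draw arbitrary unconstrained parameters for p = 6 time points.
random.seed(0)
p = 6
phi = [random.gauss(0.0, 1.0) for _ in range(p * (p - 1) // 2)]
log_d = [random.gauss(0.0, 1.0) for _ in range(p)]
sigma = build_covariance(phi, log_d)
print(is_positive_definite(sigma))
```

By contrast, a symmetric matrix whose entries are chosen freely, such as [[1, 2], [2, 1]], need not be positive definite, which is why estimating the covariance entries directly would require explicit constraints.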

Keywords/Search Tags:

Cluster analysis, Factor analysis, High-dimensional longitudinal data, Cholesky decomposition, EM algorithm, Mixture model

Related items

1	Case Analysis Mining Based On Cluster Analysis And Decision Tree Algorithm
2	Research On Government Affairs Evaluation Data Of A Government Based On Cluster Analysis
3	Regularity Analysis Of Oil Freight Market
4	Transportation In China The Development Of Statistical Analysis
5	Applied Research Into Cluster Analysis Models Of Real Estate Investing Decision
6	Evaluation Studies On Situation Of Women In Shijiazhuang
7	Improvement Of Local Outlier Factor Algorithm And Its Application In Risk Personnel Identification In Prison
8	Establishment Of The Network Model Based On Longitudinal Data With Its Applications
9	Correlation Structure Estimation And Local-feature Detection For Multivariate Longitudinal Data
10	Star Hotel In The Analysis Of The Economic Benefits Of Difference / China The Growth Of Tourism The Total Number Of Foreigners Analysis