| In the rapidly developing digital era, big data are characterized by huge volume and complex types. Such data often come from many different groups and are therefore heterogeneous, so mixture models have attracted attention from both the statistics and the computer science communities. The Gaussian mixture model is among the most widely applicable distribution models: given a sufficiently large amount of data, a finite Gaussian mixture can approximate the probability density of almost any real model. However, large datasets are often plentiful but imperfect, and incomplete or missing data are frequent. The EM algorithm, the typical and widely used parameter estimation method, and many of its derivative algorithms have mostly been studied for data that are missing completely at random (MCAR) or missing at random (MAR). The DAEM algorithm, which introduces deterministic annealing into the classical EM algorithm, largely resolves the tendency of the original algorithm to converge to a local optimum depending on the choice of initial values, and some scholars have studied how to select the critical value of the annealing parameter. This paper studies parameter estimation for Gaussian mixture models with data missing not at random (MNAR). We improve parameter estimation for finite mixture models under a nonrandom missingness mechanism by building on the DAEM algorithm, which combines deterministic annealing with the classical EM algorithm: a stochastic step is added between the E-step and the M-step, which generates a latent variable that carries the missingness-mechanism information into the Q function and generates imputed data to form a complete dataset for estimating the parameters of the mixture model. In numerical experiments, we simulate six datasets with different missingness models and missing proportions, together with a set of real data with simulated missingness, and compare the parameter estimation accuracy of the improved algorithm with that of the basic algorithm, assessing model accuracy by BIC and other information criteria. The results of the simulations and of the real-data example show the practicability and effectiveness of our method. |
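
As a rough illustration of the deterministic annealing idea mentioned in the abstract, the sketch below computes the annealed E-step memberships for a one-dimensional Gaussian mixture on complete data. It is only a minimal sketch under stated assumptions: the function name `annealed_responsibilities`, the parameter `beta` (the annealing parameter, raised toward 1 during annealing), and the numeric values are hypothetical and not taken from the paper, and the stochastic imputation step for data missing not at random is not implemented here.

```python
import numpy as np

def annealed_responsibilities(x, pi, mu, var, beta):
    """Annealed E-step for a 1-D Gaussian mixture (illustrative sketch).

    Returns an (n, k) array whose row i is proportional to
    (pi_k * N(x_i | mu_k, var_k)) ** beta. beta = 1 gives the ordinary
    EM posterior; beta = 0 gives uniform memberships.
    """
    x = np.asarray(x, dtype=float)[:, None]           # shape (n, 1)
    pi, mu, var = (np.asarray(a, dtype=float) for a in (pi, mu, var))
    # Log of pi_k * N(x_i | mu_k, var_k), kept in log space for stability.
    log_joint = (np.log(pi) - 0.5 * np.log(2.0 * np.pi * var)
                 - 0.5 * (x - mu) ** 2 / var)
    log_r = beta * log_joint                          # temper the posterior
    log_r -= log_r.max(axis=1, keepdims=True)         # guard against underflow
    r = np.exp(log_r)
    return r / r.sum(axis=1, keepdims=True)           # normalise each row

# Hypothetical values chosen only for illustration.
x = [-2.0, 0.1, 2.5]
pi, mu, var = [0.5, 0.5], [-2.0, 2.0], [1.0, 1.0]
r_cold = annealed_responsibilities(x, pi, mu, var, beta=1.0)  # ordinary EM
r_hot = annealed_responsibilities(x, pi, mu, var, beta=0.1)   # flattened
```

With a small `beta` the memberships are nearly uniform, which smooths the objective early in annealing; as `beta` approaches 1 they sharpen to the usual EM posterior, after which a standard M-step would follow.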