With the development of big data and artificial intelligence technologies, the Gaussian mixture model (GMM) has attracted extensive attention in statistics and computer science because it can accurately describe the spatial distribution and characteristics of many data sets. It is widely used in machine learning, covering supervised, unsupervised, and semi-supervised learning, and has applications in many fields. The expectation-maximization (EM) algorithm is a classical iterative method for parameter estimation in the GMM; it has good theoretical properties and is practical to apply. However, EM also has drawbacks, e.g., sensitivity to initial values, lack of guaranteed global convergence, and the need to assign latent variables in advance. Therefore, based on the GMM, this paper improves the EM algorithm for both unsupervised and semi-supervised learning, as follows.

First, for unsupervised learning, we improve the fuzzy c-means (FCM) algorithm in three respects: determining initial cluster centers by density, accelerating convergence by adding a penalty term, and determining the optimal number of clusters with the Xie-Beni index. The resulting algorithm is called IFCM. Then, for the unsupervised GMM, the IFCM algorithm is used to initialize the parameters and assign the latent variables for the EM iteration, yielding the IFCM-EM algorithm, which we analyze. Numerical simulations and empirical analysis are conducted to evaluate the performance of the proposed algorithm. The results demonstrate that the parameters are estimated accurately and fewer iterations are required compared with the state of the art; a better clustering effect is also obtained in the corresponding unsupervised Gaussian mixture model.

Second, based on the GMM, the ME-EM algorithm is proposed for semi-supervised learning. This algorithm adopts the principle of maximum entropy to change how the posterior probability of the latent variables is estimated, addressing the problem that the posterior probability estimate depends too heavily on the current parameter estimates. We then prove that the ME-EM algorithm is in essence a deterministic annealing algorithm and, under certain conditions, converges to the global optimum. When the sampled data do not reflect the true distribution of the data, its parameter estimation performance and robustness generally improve on those of the EM algorithm. Simulation results and empirical analysis demonstrate the superiority of the proposed algorithm and the improved clustering effect of the semi-supervised Gaussian mixture model.
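To make the IFCM-EM idea concrete, the following is a minimal sketch of EM for a spherical GMM whose initial responsibilities come from fuzzy-c-means-style memberships rather than a random assignment. It is an illustration under simplifying assumptions (spherical covariances, fixed initial centers, no density-based center selection or Xie-Beni model selection); the function names `fcm_memberships` and `em_gmm` are ours, not the thesis's implementation.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Fuzzy c-means membership matrix for fixed centers (fuzzifier m)."""
    # squared distances of each point to each center, shape (n, k)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    d2 = np.maximum(d2, 1e-12)                 # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))             # standard FCM membership formula
    return inv / inv.sum(axis=1, keepdims=True)

def em_gmm(X, centers0, n_iter=50):
    """EM for a spherical GMM; responsibilities initialized from FCM memberships."""
    n, d = X.shape
    r = fcm_memberships(X, centers0)           # initial latent-variable assignment
    for _ in range(n_iter):
        # M-step: mixing weights, means, per-component spherical variances
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        var = (r * d2).sum(axis=0) / (nk * d) + 1e-6
        # E-step: posterior responsibilities under the current parameters
        logp = -0.5 * d2 / var - 0.5 * d * np.log(2 * np.pi * var) + np.log(pi)
        logp -= logp.max(axis=1, keepdims=True)    # numerical stability
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
    return pi, mu, var
```

Because the initial responsibilities already separate the clusters softly, the EM iteration starts near a good basin, which is the mechanism behind the reduced iteration counts reported above.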
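The deterministic-annealing view of the ME-EM E-step can be sketched as a tempered posterior: the component log-joint is divided by a temperature T before normalization, so that at high T the responsibilities are close to uniform (maximum entropy, weak dependence on the current parameter estimates) and at T = 1 the ordinary EM E-step is recovered. This is a generic annealed-E-step sketch, not the thesis's exact estimator; `annealed_responsibilities` is an illustrative name.

```python
import numpy as np

def annealed_responsibilities(log_joint, T):
    """Tempered posterior over components.

    log_joint : array (n, k) of log p(x_i, z_i = k) under current parameters.
    T         : temperature, T >= 1; T = 1 gives the standard EM E-step,
                large T gives a near-uniform (high-entropy) posterior.
    """
    z = log_joint / T
    z -= z.max(axis=1, keepdims=True)          # numerical stability
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)
```

In a deterministic-annealing schedule, T is lowered gradually toward 1 across EM sweeps, which is what allows convergence toward the global optimum under the conditions stated above rather than trapping the iteration in the basin chosen by the first posterior estimate.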