| In many regression applications, difficulties arise from nonlinear relationships, heterogeneous groups, or time series, so that two or more regression functions are needed to summarize the underlying structure of the data. In most cases we do not know a priori which subset of observations belongs to which regression function, and in many applications we do not even know how many regression functions there are. The purpose of clustering regression is to divide the data into several classes and to find an approximate regression function for each class. Among these methods, cluster-wise linear regression is the simplest and a widely used approach. It has attracted great attention and has been applied in many fields of science, such as economics, medicine, and geology. This paper divides the problem into two steps. First, combined with an algorithm for non-smooth non-convex optimization, we give a consistent estimate of the true number of clusters. Under a general probability structure, this method recovers the true number of clusters with probability 1 as the sample size tends to infinity. Second, given the number of clusters k, we use an optimization method for non-smooth non-convex functions to obtain a global or approximately global optimal solution (if the data are sufficiently dense). The method minimizes a non-smooth non-convex objective function, increasing the number of clusters gradually in each iteration (stopping once k clusters are reached), which provides a good initial point for the global optimization problem. Simulations are given to investigate the performance of the methodology. Finally, we apply the method to the stratification of migrant workers. |
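
To make the non-smooth non-convex objective concrete, the following minimal sketch assumes one predictor and squared errors, so that the cluster-wise objective is the sum over observations of the smallest squared residual among the k lines. The function names (`fit_line`, `objective`, `refine`) and the toy data are hypothetical, and the refinement step is a plain alternating reassign-and-refit heuristic, not the incremental non-smooth optimization method of the paper.

```python
def fit_line(points):
    """Ordinary least squares fit of y = a*x + b to a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0.0:
        # All x values coincide: fall back to a horizontal line through the mean.
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in points) / sxx
    return a, my - a * mx


def objective(points, lines):
    """Cluster-wise objective: each point contributes its smallest squared residual."""
    return sum(min((y - a * x - b) ** 2 for a, b in lines) for x, y in points)


def refine(points, lines, max_iter=100):
    """Alternate between assigning points to their best line and refitting each line."""
    for _ in range(max_iter):
        groups = [[] for _ in lines]
        for x, y in points:
            errs = [(y - a * x - b) ** 2 for a, b in lines]
            groups[errs.index(min(errs))].append((x, y))
        # Keep the old line when a cluster receives no points.
        new_lines = [fit_line(g) if g else old for g, old in zip(groups, lines)]
        if new_lines == lines:
            break
        lines = new_lines
    return lines


# Hypothetical toy data: two noise-free groups, y = 2x + 1 and y = -x + 5.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(float(x), -1.0 * x + 5.0) for x in range(10)]

single = [fit_line(points)]                         # k = 1: one global regression
lines = refine(points, [(1.0, 0.0), (-0.5, 4.0)])   # k = 2 from a rough initial guess
```

Because the inner `min` makes the objective non-smooth and non-convex, the outcome of such a heuristic depends on the initial lines; this is the sensitivity that a carefully constructed initial point is meant to address.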