| With the growth and spread of the Internet, online display advertising has become the mainstream business in the advertising field. In this business, the accuracy of online ad click-through rate (CTR) prediction directly affects both the advertiser's delivery results and the advertising company's revenue, so improving that accuracy has become a popular research topic. First, using Alibaba's 2017 multi-source Taobao display-advertising CTR prediction dataset, this paper performs online ad click prediction with four classic models: Logistic Regression, Factorization Machine, Gradient Boosting Decision Tree, and Random Forest. Following the classical idea of feeding combined features extracted from boosted decision trees into a logistic regression model, the paper examines whether each of the four classic models benefits from such combined features, which provides data support for selecting the primary models of the fusion model designed here. Second, a single prediction model has two major problems: it does not account for differences between users, and its prediction results are limited. To address them, fuzzy C-means clustering is introduced to group users, and several classic models are built on each user group. The most suitable primary model for each user group is the one with the highest AUC on that group's data in the model-selection prediction set, and the primary models are then fused according to the fuzzy-clustering membership degrees. Finally, the fusion model with 3 clusters is compared with the classic models; its AUC exceeds that of every single benchmark model, which indicates that the fusion model is feasible and effective. The influence of the number of clusters on the fusion model's performance is also examined, and for every number of clusters tested the fusion model's AUC is higher than that of any single benchmark model, which further supports its performance. |
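
As a rough illustration of the fusion step described above, the following sketch computes fuzzy C-means membership degrees for each user and uses them to weight the predictions of the per-cluster primary models. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not give the exact fusion rule, so a membership-weighted average is assumed, and the function names `fcm_memberships` and `fuse_predictions`, the fuzzifier `m=2.0`, and the toy data are all hypothetical.

```python
import numpy as np


def fcm_memberships(X, centers, m=2.0):
    """Fuzzy C-means membership degree of each row of X to each cluster center.

    Uses the standard FCM formula u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1)),
    where d_ij is the Euclidean distance from sample i to center j.
    """
    # Pairwise distances, shape (n_samples, n_clusters).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    # Guard against division by zero when a sample coincides with a center.
    d = np.maximum(d, 1e-12)
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuse_predictions(memberships, cluster_preds):
    """Assumed fusion rule: membership-weighted average of per-cluster predictions.

    cluster_preds[i, j] is the click probability that cluster j's primary
    model assigns to sample i.
    """
    return (memberships * cluster_preds).sum(axis=1)


# Hypothetical toy data: three users described by two features, two clusters.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])

# Hypothetical click probabilities from each cluster's primary model.
cluster_preds = np.array([[0.1, 0.9],
                          [0.2, 0.8],
                          [0.3, 0.7]])

U = fcm_memberships(X, centers)
fused = fuse_predictions(U, cluster_preds)
```

A user who sits on a cluster center is scored almost entirely by that cluster's primary model, while a user equidistant from both centers receives the plain average of the two models' predictions.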