A Click-Through Rate Prediction For DSP

Posted on: 2019-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: P C He
GTID: 2370330590492260
Subject: Computer technology
Abstract/Summary:
As the pattern of programmatic buying becomes increasingly clear and demand grows day by day, the Demand-Side Platform (DSP) has developed rapidly. In recent years, the market scale has reached billions of dollars per year. Click-through rate (CTR) prediction is one of the core technologies of the Demand-Side Platform and plays a key role. Based on the specific scenarios and requirements of CTR prediction for DSP advertising, this paper surveys the relevant literature and analyzes CTR models in depth. After studying how Google, Microsoft, Facebook and other large companies approach the ad click rate prediction problem, and with reference to the model proposed by Facebook that combines Gradient Boosting Decision Trees with Logistic Regression [1], this paper replaces the Logistic Regression with a Field-aware Factorization Machine and obtains a new prediction model: Gradient Boosting Decision Trees (GBDT) combined with a Field-aware Factorization Machine (FFM). The GBDT takes all continuous features and selected categorical features as input to generate new features, which yields good feature combinations and helps avoid over-fitting, while the FFM model decomposes the features into a multi-dimensional space, which makes the model more accurate. Secondly, to address the huge feature space, a Chi-square test is used for feature selection to filter out features that do not help training, which reduces the feature space and improves the quality of the training data. At the same time, the dimensionality of the Chi-square selection result is reduced to a level that is acceptable in a practical implementation. This paper also implements the model on Spark. The whole process is divided into four parts: the GBDT part combines the continuous features and some categorical features to generate new features; FFM-Pre1 compresses the continuous features; FFM-Pre2 implements the Chi-square selection and the hashing trick; finally, the features from the three former parts are combined to train the FFM. Most of the algorithms, such as gradient boosting decision trees and the Chi-square test, are based on the Spark MLlib library, and the FFM algorithm is based on a third-party extension library. Finally, a comparison with several commonly used models shows that the proposed model achieves a significant improvement.
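The last two stages described above, the hashing trick and FFM scoring, can be illustrated with a minimal sketch in plain Python. It is not the thesis's Spark implementation. The field names, the bucket count, the latent dimension, and every function name here are assumptions made for illustration, and the weights are random rather than trained, so the resulting probability carries no meaning beyond showing the computation.

```python
import hashlib
import math
import random

FIELDS = ["publisher", "advertiser", "gbdt_leaf"]  # hypothetical field names
NUM_BUCKETS = 1024  # assumed size of the hashed feature space
K = 4               # assumed latent dimension


def hash_feature(field, value, num_buckets=NUM_BUCKETS):
    """Hashing trick: map a 'field=value' string to a stable bucket index."""
    digest = hashlib.md5(f"{field}={value}".encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def init_weights(num_buckets, num_fields, k, seed=0):
    """One k-dimensional latent vector per (hashed feature, field) pair."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.1, 0.1) for _ in range(k)]
             for _ in range(num_fields)]
            for _ in range(num_buckets)]


def ffm_score(sample, weights):
    """Sum of <w[j_a][f_b], w[j_b][f_a]> * x_a * x_b over all feature pairs.

    sample is a list of (field_index, feature_index, value) triples.
    """
    total = 0.0
    for a in range(len(sample)):
        f_a, j_a, x_a = sample[a]
        for b in range(a + 1, len(sample)):
            f_b, j_b, x_b = sample[b]
            dot = sum(p * q for p, q in zip(weights[j_a][f_b], weights[j_b][f_a]))
            total += dot * x_a * x_b
    return total


def predict_ctr(sample, weights):
    """Squash the FFM score into a click probability with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-ffm_score(sample, weights)))


def encode(raw):
    """Turn a {field: value} dict into (field_index, hashed_index, 1.0) triples."""
    return [(FIELDS.index(f), hash_feature(f, v), 1.0) for f, v in raw.items()]


# Hypothetical impression; "gbdt_leaf" stands in for a GBDT-generated feature.
impression = {"publisher": "news_site",
              "advertiser": "shoe_brand",
              "gbdt_leaf": "tree3_leaf7"}
weights = init_weights(NUM_BUCKETS, len(FIELDS), K)  # random, untrained
ctr = predict_ctr(encode(impression), weights)
```

Each feature keeps a separate latent vector for every field it can interact with, which is what distinguishes FFM from a plain factorization machine. Hashing bounds the size of the weight table at the cost of occasional collisions between features.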
Keywords/Search Tags:Programmatic Buying, Computational Advertising Techniques, Real-Time Bidding, Click-Through Rate Prediction, Demand-Side Platform