| With the development of information technology, the ecosystem of digital marketing is constantly changing. Relying on computational advertising technology, the emphasis of digital marketing has shifted towards delivering targeted web-based advertisements to specific user cohorts or individuals. The prediction of click-through rate (CTR) for online advertisements is a prerequisite for implementing digital marketing strategies and serves as the cornerstone of computational advertising. This pivotal technique in internet companies’ advertising systems has attracted considerable attention in both academic research and industrial applications. To address the challenge of predicting the CTR of online advertisements, numerous scholars have proposed an array of machine learning models. However, a common shortfall is evident: most research efforts fail to model user interests across different time spans when predicting ad CTR. Moreover, user historical behavior, encompassing activities such as "purchasing", "adding to cart", and "browsing", is usually processed without distinguishing or utilizing the different behavior types, which misses the opportunity to capture the differing intensity of interest underlying each type of action. Therefore, a novel model is proposed, namely the Multi-time Scale Deep Interest Network (MSDIN), grounded in the extraction of user interests at multiple time scales. This model aims to further uncover the latent interests contained in user behaviors, improve the match between advertisements and users, and ultimately achieve precision marketing. The core contributions of this thesis are as follows: (1) Through pre-processing and statistical analysis of the Taobao website advertising dataset, data is prepared for the subsequent model building, and a valuable reference for the use of user historical behavior data is provided. The missing and abnormal records in the dataset are identified and processed, and continuous features are standardized
to prepare the data for model building. Then, based on statistical analysis of the data, the features are screened. SMOTE oversampling is used to deal with the imbalance between positive and negative samples in the data. Analysis of the user behavior logs shows that different types of behavior reflect different intensities of interest, which should be distinguished in the model. (2) In response to the shortcomings of current research and the implicit characteristics of user behavior, a new model is proposed based on user behavior information extraction. The model makes three main improvements. The first is to model user interests at multiple time scales. Because users’ historical behavior implies both inherent interests and short-term interests influenced by incidental events, the model divides user historical behavior features into short-term and long-term behavior sequences along the time dimension to extract users’ recent interests and long-term inherent interests. The second is to refine the features of the behavior sequences: the model splits the long-term behavior sequence at a finer granularity by behavior type, distinguishing the impact of different types of behavior on click-through rate. For each behavior sequence, coarse-grained brand IDs and category IDs are used instead of product IDs to express user historical behavior, so that the model learns user preferences for advertisement types and brands and can present users with more diverse advertisements. The third is to combine gated recurrent units and attention mechanisms to explore long-term and short-term interests. For the short-term behavior sequence, a Gated Recurrent Unit (GRU) is used to extract the user’s most recent interest state. For long-term behavior sequences, the encoder structure of the Transformer is introduced, and the contribution of different types of historical user behavior to ad click-through rate is extracted using a fusion attention mechanism. (3) In this thesis, comparative and ablation experiments are conducted on the
preprocessed Taobao website advertising dataset to validate the effectiveness of the proposed model. In the comparative experiments, the model demonstrates significant advantages over the DIN and DIEN models, which also model user behavior sequences. In the ablation experiments, the effectiveness of the model structure is verified by comparison with a version of the model with the improved structure removed. Finally, based on the findings of this thesis, suggestions on ad CTR prediction are provided for internet enterprises, and future research directions are outlined. The method described in this thesis can not only solve practical problems in advertising scenarios, but can also be applied in multiple related fields, such as information recommendation on content sharing platforms and information push on news platforms. |
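
The multi-time-scale split described in contribution (2) can be illustrated with a minimal sketch. It is not the thesis's implementation: the `Behavior` record, the `split_history` helper, the behavior labels, and the 3-day `short_window` cutoff are all hypothetical choices made for illustration. The sketch only shows the data preparation idea, namely representing each behavior by its category and brand IDs rather than its product ID, placing recent behaviors in one short-term sequence, and grouping older behaviors into per-type long-term sequences.

```python
from collections import defaultdict
from typing import NamedTuple

DAY = 24 * 3600  # seconds in one day


class Behavior(NamedTuple):
    """One logged user action (field names are assumptions, not from the thesis)."""
    timestamp: int    # seconds since epoch
    btype: str        # hypothetical labels: "browse", "cart", "buy"
    item_id: int
    category_id: int
    brand_id: int


def split_history(history, now, short_window=3 * DAY):
    """Split one user's log into a short-term sequence and per-type long-term sequences.

    short_window is an assumed cutoff, not a value taken from the thesis.
    Each behavior is encoded as (category_id, brand_id) instead of item_id.
    """
    short_term = []
    long_term = defaultdict(list)
    for b in sorted(history, key=lambda b: b.timestamp):
        token = (b.category_id, b.brand_id)  # coarse-grained IDs only
        if now - b.timestamp <= short_window:
            short_term.append(token)          # recent interest, in time order
        else:
            long_term[b.btype].append(token)  # older behavior, grouped by type
    return short_term, dict(long_term)


# Usage on a toy log of four behaviors.
now = 100 * DAY
history = [
    Behavior(now - 30 * DAY, "browse", 1, 10, 100),
    Behavior(now - 20 * DAY, "buy", 2, 10, 101),
    Behavior(now - 10 * DAY, "browse", 3, 11, 100),
    Behavior(now - 1 * DAY, "cart", 4, 12, 102),
]
short_term, long_term = split_history(history, now)
print(short_term)  # [(12, 102)]
print(long_term)   # {'browse': [(10, 100), (11, 100)], 'buy': [(10, 101)]}
```

In a full model along the lines the abstract describes, the short-term sequence would feed a GRU and the per-type long-term sequences would feed a Transformer encoder with attention; those components are omitted here because the abstract does not specify their configuration.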