Research On Multi-Source Cascade Popularity Prediction Based On Rich Information Heterogeneous Graph

Posted on: 2024-03-16 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Wu | Full Text: PDF
GTID: 2568306941964479 | Subject: Software engineering
Abstract/Summary:
With the spread of the Internet, online social networks have emerged and become an important channel for information diffusion. Most existing online social networking platforms allow users to tag information with topics, so that multiple users can spontaneously post and spread the same topic, and these posts ultimately form multi-source information. This thesis investigates popularity prediction for multi-source information. In general, the popularity of a piece of information measures the number of users involved in its diffusion, and popularity prediction models can be applied in areas such as advertising and marketing, opinion monitoring, risk control and epidemiological research. Existing methods face three main challenges: insufficient capacity to handle multi-source information, unreasonable processing of spatio-temporal data, and a lack of effective exploitation of multi-modal data. This research draws on ideas and methods from graph representation learning, spatio-temporal feature extraction and natural language processing to address these challenges step by step. First, it constructs user-repost spatio-temporal heterogeneous graphs that can simultaneously express spatio-temporal features and handle the disconnection of multi-source cascades, and then integrates recurrent neural networks between the heterogeneous graph neural network layers to strengthen the coupling of spatio-temporal feature learning. It then proposes a soft partitioning algorithm for time intervals and a weighted dynamic sampling algorithm for user following relationships to optimise the processing of spatio-temporal data. A two-stage prediction output is designed to address the unbalanced distribution of popularity labels, and a multi-task optimisation strategy improves the learning capability of the model. In addition, this research standardises the computational process of information popularity prediction and designs a generic prediction framework with different candidate strategies and modules. Finally, it proposes a method for constructing a Weibo-repost heterogeneous graph with rich information, which covers complex information diffusion structures (e.g. the same user participating in the diffusion multiple times) and multi-modal data (such as content text, user tags, user authentication and geographic location). It further designs feature generation schemes for the different data modalities. The Neo4j graph database is introduced to optimise data storage and processing and to decouple the model design from the data processing. Extensive experiments on real-world social network datasets from Sina Weibo and Twitter evaluate the performance of the above methods. In summary, this research addresses the three main challenges of existing methods and proposes optimisation methods for framework design, model training and optimisation, and data storage and processing. The method proposed in this thesis is presented as the first modular popularity prediction framework that supports multi-source information processing and multi-modal feature exploitation, with excellent prediction accuracy, extensibility and research value.
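As a small illustration of the popularity measure described in the abstract (the number of users involved in the diffusion of a piece of information), the following sketch counts the users who take part in any cascade carrying the same topic. It is not taken from the thesis: the record layout `(topic, cascade_id, user_id)`, the function name `multi_source_popularity`, and the choice to count each user only once per topic are assumptions made for illustration.

```python
from collections import defaultdict


def multi_source_popularity(reposts):
    """Count distinct users per topic across all cascades that carry the topic.

    `reposts` is an iterable of (topic, cascade_id, user_id) tuples. This record
    layout is hypothetical and chosen only for illustration.
    """
    users_by_topic = defaultdict(set)
    for topic, _cascade_id, user_id in reposts:
        # A set ignores repeats, so a user who takes part several times
        # (in one cascade or in several) is counted once per topic.
        users_by_topic[topic].add(user_id)
    return {topic: len(users) for topic, users in users_by_topic.items()}


# Two separate cascades (c1, c2) share "#topicA"; user u1 appears in both.
reposts = [
    ("#topicA", "c1", "u1"),
    ("#topicA", "c1", "u2"),
    ("#topicA", "c2", "u3"),
    ("#topicA", "c2", "u1"),
    ("#topicB", "c3", "u4"),
]
popularity = multi_source_popularity(reposts)
print(popularity)  # {'#topicA': 3, '#topicB': 1}
```

Grouping by topic rather than by cascade is what makes the count multi-source in this sketch: cascades that are disconnected from each other still contribute to the same total.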
Keywords/Search Tags:Deep Learning, Online Social Networks, Popularity Prediction, Multi-source Cascade, Heterogeneous Graph, Rich Information