Research On Interests Of Sina Weibo Users Based On LDA Topic Model

Posted on:2021-02-04

Degree:Master

Type:Thesis

Country:China

Candidate:Y J Shi

Full Text:PDF

GTID:2427330602983966

Subject:Applied statistics

Abstract/Summary:

As China's mobile Internet has matured and stabilized, social platforms have paid more attention to diversifying their content and have actively sought innovations and breakthroughs in their models in order to seize market share from one another. Although Sina Weibo continues to lead the mobile social industry, fierce competition has also brought great challenges to its development. The core competitiveness of Sina Weibo lies in the communicative impact of its leading user groups and high-quality original content, which requires the platform to grasp users' needs more accurately in the current environment. Focusing on this issue, this thesis studies the interests and preferences of Sina Weibo users.

The LDA (Latent Dirichlet Allocation) probabilistic topic model is a three-layer probabilistic model proposed by Blei et al. Training it yields the probability distribution of each document over the topic space and the probability distribution of each topic over the word space. Because it is an unsupervised method, it does not require annotated examples and can be applied directly to an unknown corpus. Many studies of Sina Weibo users' interest preferences use it directly in this way: the Weibo posts created by each user are treated as one document for modeling and training, and the distribution of topic terms obtained for each document is taken as a description of that user's interest preferences.

This thesis adds another inference method. First, a known corpus is used to supervise training and obtain an optimal model. This trained model is then used to semantically mine and analyze the user-level documents in other, unknown corpora. Here the known corpus is built from the classification labels of popular features on Sina Weibo, which ensures that the corpus is consistent in the characteristics of the words used before and after model inference.

In addition, drawing on experience of using the Sina Weibo platform and on Sina Weibo's development philosophy in recent years, this thesis proposes the hypothesis that the content a user has liked in the past should be added to expand that user's document in empirical research. A questionnaire survey and empirical research show that this hypothesis is reasonable in theory and effective in practice. For data collection, because the Sina Weibo platform restricts access, a crawler system for Sina Weibo was designed and developed in Python to collect data for different research needs.
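The train-then-infer workflow described in the abstract can be illustrated with a small pure-Python sketch. It is not the thesis's implementation: collapsed Gibbs sampling is assumed as the estimation method, the function names (`build_user_document`, `train_lda`, `infer_user_topics`) and the toy English tokens are hypothetical, and the use of classification labels during training is not modeled. Real Weibo text would first need Chinese word segmentation.

```python
import random

def build_user_document(posts, liked_posts):
    """Merge a user's own posts and liked posts (token lists) into one document."""
    return [tok for post in posts + liked_posts for tok in post]

def train_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return phi, one word distribution per topic."""
    rng = random.Random(seed)
    vocab = {w for doc in docs for w in doc}
    n_kw = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]  # topic-word counts
    n_k = [0] * num_topics                                       # tokens per topic
    n_dk = [[0] * num_topics for _ in docs]                      # doc-topic counts
    z = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
    v_beta = len(vocab) * beta
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]  # remove the token's current assignment
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                weights = [(n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + v_beta)
                           for t in range(num_topics)]
                k = rng.choices(range(num_topics), weights)[0]  # resample its topic
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return [{w: (n_kw[k][w] + beta) / (n_k[k] + v_beta) for w in vocab}
            for k in range(num_topics)]

def infer_user_topics(doc, phi, alpha=0.1, iters=100, seed=0):
    """Estimate the topic distribution of an unseen document with phi held fixed."""
    rng = random.Random(seed)
    num_topics = len(phi)
    doc = [w for w in doc if w in phi[0]]  # drop words the trained model never saw
    z = [rng.randrange(num_topics) for _ in doc]
    counts = [0] * num_topics
    for k in z:
        counts[k] += 1
    for _ in range(iters):
        for i, w in enumerate(doc):
            counts[z[i]] -= 1
            weights = [(counts[t] + alpha) * phi[t][w] for t in range(num_topics)]
            z[i] = rng.choices(range(num_topics), weights)[0]
            counts[z[i]] += 1
    total = len(doc) + num_topics * alpha
    return [(c + alpha) / total for c in counts]

# Toy "known corpus" standing in for documents built from Weibo category labels.
known_corpus = [
    ["football", "match", "goal", "team"],
    ["team", "goal", "league", "match"],
    ["movie", "actor", "film", "trailer"],
    ["film", "actor", "premiere", "movie"],
]
phi = train_lda(known_corpus, num_topics=2)

# One user document: own posts expanded with liked posts, as the abstract proposes.
user_doc = build_user_document(posts=[["match", "goal"]],
                               liked_posts=[["team", "league", "selfie"]])
theta = infer_user_topics(user_doc, phi)
print(theta)  # the user's estimated interest distribution over the two topics
```

Holding `phi` fixed during inference is what keeps the topics learned from the known corpus unchanged while only the new user's topic proportions are estimated. A library such as gensim or scikit-learn would normally replace the hand-written sampler.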

Keywords/Search Tags:

Users on Sina Weibo, Interest mining and analysis, Topic model, LDA, Crawler system

Related items

1	Topic Mining And Emotion Analysis Of Weibo Caused By Graduate Student Falling From A Building
2	Research On Weibo Users’ Attitudes Toward Homosexuality
3	Research On The Emotion Mining And Communication Of Weibo Users In The World Cup Situation
4	Research On Microblog Topic Sequential Feature Extraction Algorithm Based On LDA-WO Mixed Model
5	Research On Sina Weibo User Information Based On Two Improved Clustering Algorithm
6	Topic Mining Of Weibo Comments Based On STM And Emotional Evolution Research
7	The Change Of Prevalent Symbols Of Male Images In Sina Weibo(2012-2018)
8	Research On The Present Situation And Countermeasures Of The Value Orientation Of Sina Weibo Blogger
9	Subculture Research On Sina Weibo
10	Mining Topics Of Chinese Core Journals Of Statistics Based On LDA