Application Of Community Identification And Topic Detection Based On Micro Blog

Posted on: 2016-09-24  Degree: Master  Type: Thesis
Country: China  Candidate: Y Zhang
GTID: 2348330488477272  Subject: Software engineering
Abstract/Summary:
The rapid development of the Internet has greatly promoted social networking, and information networks represented by microblogs have become an important part of people's daily life and work. Discovering the communities contained in these complex information networks can guide users to information that interests them, help businesses find customers and provide more accurate, personalized recommendations, and help network service providers organize the structure of their sites effectively. Community discovery attempts to identify the inherent community structure of a network. Because social networks are complex, finding communities manually is difficult and inefficient, so research on microblog community discovery has both theoretical and practical value.

This thesis focuses on community discovery based on the relationships between users. It first introduces the research background in China and abroad and the relevant definitions, such as social networks, lists the characteristics of microblogging and of viral propagation, and discusses the research approach that underlies the work. For data acquisition, it starts from a comparison of the differences between the Google and Baidu crawlers and proposes a solution for crawling content without cookies. Starting from the Newman fast algorithm, it proposes a microblog community discovery method based on scattering-like structures.

Next, the thesis studies this scattering-based microblog community discovery algorithm. A large number of scattered structures are found in the microblog network. By identifying "representative" center nodes, the network can be divided into communities around these nodes. The thesis then describes the processing flow of a microblog topic detection and tracking system.
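The division of a network around "representative" center nodes mentioned above can be illustrated with a small sketch. It is not the thesis's algorithm: it assumes that the representative nodes are simply the highest-degree nodes and that every other node joins the center that reaches it first in a multi-source breadth-first search. The function name `assign_to_centers` and the toy graph are hypothetical.

```python
from collections import deque


def assign_to_centers(adjacency, num_centers):
    """Split an undirected graph into communities around high-degree centers."""
    # Assumption: "representative" center nodes are the highest-degree nodes.
    # Ties are broken by node name so the result is deterministic.
    centers = sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))[:num_centers]
    community = {center: center for center in centers}
    queue = deque(centers)
    # Multi-source breadth-first search: a node joins the community of the
    # first center whose search front reaches it.
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adjacency[node]):
            if neighbor not in community:
                community[neighbor] = community[node]
                queue.append(neighbor)
    # Nodes unreachable from every center are left out of the result.
    return community


# Toy follower graph (treated as undirected) with two star-like hubs, "a" and "e".
graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d", "f", "g", "h"},
    "f": {"e"},
    "g": {"e"},
    "h": {"e"},
}
communities = assign_to_centers(graph, num_centers=2)
```

Here `communities` maps each node to the center it was assigned to. A method closer to the one the abstract describes would presumably choose centers with a criterion tied to the scattered structures it reports, rather than raw degree.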
In the topic detection and tracking system, after a large volume of microblog information has been processed, topics are detected automatically with a Single-Pass clustering algorithm combined with quick sorting. The thesis then introduces a topic tracking algorithm based on adaptive query vectors, which makes a de-noised topic tracking procedure possible.

Surveying the existing community discovery algorithms and drawing on their strengths to offset their weaknesses, and taking into account the feature that distinguishes microblogs from other social networks, namely that following can be one-way, the thesis focuses on a community division method based on the tightness of user relationships.

The test results show that network information can be collected with a time-based, breadth-first technique. Because microblog content changes quickly, a time analyzer is added when pages are fetched so that outdated information is not collected: it checks whether a page's content is earlier than a preset time and uses that to decide whether the page is collected. Pages are collected breadth-first rather than depth-first, which keeps collection efficient and ensures that the relevant information is covered.
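As a rough illustration of Single-Pass clustering for topic detection, the sketch below processes posts once, in order, and either adds each post to the most similar existing cluster or starts a new one. The details are assumptions rather than the thesis's implementation: whitespace tokenization (real microblog text would need proper word segmentation), raw term counts as vectors, cosine similarity, summed term counts as the cluster centroid, and a hypothetical threshold of 0.3.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass(posts, threshold=0.3):
    """Cluster posts in one pass; each cluster stands for one detected topic."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [post indices]}
    for index, text in enumerate(posts):
        vector = Counter(text.lower().split())  # assumption: whitespace tokens
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vector, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Similar enough: join the cluster and fold the post into its centroid.
            best["members"].append(index)
            best["centroid"].update(vector)
        else:
            # Otherwise the post seeds a new topic.
            clusters.append({"centroid": vector, "members": [index]})
    return clusters


posts = [
    "earthquake hits city center",
    "city earthquake rescue teams arrive",
    "new phone launch event today",
    "phone launch draws big crowd",
]
topics = single_pass(posts)
member_lists = [cluster["members"] for cluster in topics]
```

Because each post is compared only with the clusters that exist when it arrives, the result depends on input order and on the threshold. That is the usual trade-off of Single-Pass clustering in exchange for handling a stream of posts in a single scan.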
Keywords: Data collection, Community detection, Topic detection, Social networking