
Research And Implementation Of Community Evolution Analysis System Under Massive Data

Posted on: 2024-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Yang
Full Text: PDF
GTID: 2530306944962489
Subject: Computer Science and Technology
Abstract/Summary:
Network science studies the problems shared by complex systems across different applications and domains, together with general methods for dealing with them. With the advancement of society and the advent of the Internet information age, complex networks have grown in scale, and traditional community discovery techniques that rely on in-memory computation can no longer meet the needs of such networks. To meet this challenge, parallelization techniques can be used to handle massive amounts of data efficiently and so improve community discovery.

For the first key algorithm studied in this paper, a single-machine community discovery algorithm based on core node expansion is proposed. The algorithm identifies the core nodes of the network by calculating the similarity between neighboring nodes, uses the core nodes as seeds to compute a first-stage community structure, merges communities by examining the nodes that different communities share, and then outputs the final community structure. The paper next proposes a distributed processing model for complex networks on the Hadoop platform, which uses HDFS to initialize network files and a distributed database to store, read, and write data, and which establishes a multi-level MapReduce data processing model for more efficient network management. Finally, a MapReduce-based distributed community discovery algorithm is proposed by combining the single-machine algorithm with the multi-stage distributed processing model. Comparison experiments show that the proposed algorithm model achieves good accuracy and time efficiency in community discovery, and that its computational efficiency on networks of different sizes can be improved by changing the number of tasks in the MapReduce framework.

Real-world networks are also dynamic and evolve over time, and static network analysis cannot capture the evolving behavior of dynamic networks. Detecting how a community evolves provides insight into the underlying behavior of the network. This paper therefore proposes a framework for detecting the evolutionary behavior of communities in dynamic networks. First, a new community matching algorithm is proposed that tracks and identifies similar communities over time and establishes relationships between them as the basis for evolutionary behavior. Then, a community evolutionary behavior detection algorithm based on the number and importance of nodes is proposed, which further analyzes the evolutionary process of dynamic networks by considering the relationships and influence between nodes. The framework is applied to several real datasets to verify its capability and applicability. Experimental studies show that the proposed algorithm can accurately mine more community evolutionary behaviors in dynamic networks. The paper also investigates how the algorithm's parameters affect the detection results for different evolutionary behaviors, so as to further improve the accuracy of the algorithm.

On this basis, the paper proposes a new community discovery and dynamic evolutionary behavior detection system that can efficiently handle massive amounts of data and that is accessible, through a web front-end, to researchers with no algorithmic experience, making it easier for them to conduct research on complex networks. The system not only supports two important applications, community discovery and community dynamic evolutionary behavior detection, but also helps researchers understand and apply these methods.

This thesis first introduces the research background and significance of the subject and reviews the research status and related technologies at home and abroad. Then, based on the business scenario, it carries out a requirements analysis, proposes the corresponding algorithm for each key requirement, and describes the experimental process. It then presents the design and implementation of the whole system, as well as its deployment and testing. Finally, the work of the thesis is summarized and directions for future research are outlined.
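
As a rough illustration of the community matching and evolution detection steps described in the abstract, the following Python sketch matches communities between two consecutive snapshots and labels simple evolution events. It is not the thesis's algorithm: the Jaccard similarity measure, the `threshold` value, the event names, and all function names are assumptions chosen for illustration, and the sketch uses only node counts, ignoring the node importance that the abstract mentions.

```python
from typing import Dict, List, Set, Tuple

Community = Set[str]  # a community is a set of node identifiers


def jaccard(a: Community, b: Community) -> float:
    """Overlap of two node sets, in [0, 1]. (Assumed similarity measure.)"""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_communities(
    prev: Dict[str, Community],
    curr: Dict[str, Community],
    threshold: float = 0.3,  # hypothetical cut-off, not taken from the thesis
) -> List[Tuple[str, str, float]]:
    """Return (prev_id, curr_id, similarity) for every pair at or above threshold."""
    matches = []
    for pid, pnodes in prev.items():
        for cid, cnodes in curr.items():
            sim = jaccard(pnodes, cnodes)
            if sim >= threshold:
                matches.append((pid, cid, sim))
    return matches


def label_events(
    prev: Dict[str, Community],
    curr: Dict[str, Community],
    matches: List[Tuple[str, str, float]],
) -> List[str]:
    """Label evolution events from the match relation and community sizes."""
    successors: Dict[str, List[str]] = {}
    predecessors: Dict[str, List[str]] = {}
    for pid, cid, _ in matches:
        successors.setdefault(pid, []).append(cid)
        predecessors.setdefault(cid, []).append(pid)

    events = []
    for pid in prev:
        nxt = successors.get(pid, [])
        if not nxt:
            events.append(f"{pid}: dissolve")
        elif len(nxt) > 1:
            events.append(f"{pid}: split into {sorted(nxt)}")
    for cid in curr:
        prv = predecessors.get(cid, [])
        if not prv:
            events.append(f"{cid}: form")
        elif len(prv) > 1:
            events.append(f"{cid}: merge of {sorted(prv)}")
        elif len(successors[prv[0]]) == 1:
            # One-to-one match: compare sizes to decide grow / shrink / continue.
            delta = len(curr[cid]) - len(prev[prv[0]])
            kind = "grow" if delta > 0 else "shrink" if delta < 0 else "continue"
            events.append(f"{prv[0]} -> {cid}: {kind}")
    return events
```

Calling `match_communities` on the community dictionaries of snapshots t and t+1 and passing the result to `label_events` yields one label per detected event. The pairwise comparison is quadratic in the number of communities, so a distributed setting such as the one described above would need to partition this step.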
Keywords/Search Tags: distributed technologies, community discovery, time series networks, evolutionary behavior detection