Community Detection In Social Network Based On Structure-based Random Walk And Node Attributes' Information Entropy

Posted on: 2017-03-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Wu
Full Text: PDF
GTID: 2310330536953073
Subject: Computer Science and Technology
Abstract/Summary:
With the spread of Web 2.0, many user-interactive social media have emerged, such as Twitter, Facebook, BBS forums, social news sites, and wikis. These shared platforms have given rise to a new kind of social network. Community detection is the foundation of, and the most important task in, research on complex networks. Many empirical studies show that networks contain a large number of community structures, which have been shown to be the most important topological feature of networks. A community structure is defined as a group of densely connected nodes with few connections to nodes outside the group. Detecting communities in large-scale networks is an important task in many scientific domains and has attracted the attention of many researchers, who find that members of the same community tend to be similar to one another: for example, they may share the same interests or goals. Sociologists can use community structures to better understand social groups, while merchants can identify potential customer groups and profit from personalized group recommendation. Community detection in social networks is therefore significant in both theory and practice.

The problem of community detection can draw on two sources of information. One is the network structure, which describes the relationships between nodes. The other is node attributes, which describe important features of each node. Many community detection algorithms have been developed in the past few years, but most of them focus on either network structure or node attributes alone, and as a result their performance falls short of expectations. Only recently have approaches based on both network structure and node attributes been developed. Such algorithms are challenging to design, because they have to combine two very different kinds of information.

Building on previous studies, this thesis proposes a new social network community detection algorithm based on structure-based random walk and node attributes' information entropy (SARIE for short). SARIE can be divided into three steps. First, we adopt a random walk model to identify candidate communities in a complex network from its structure information. Each random walk yields a node track, that is, a set of nodes that are likely to belong to the same community, so we take each node track as a candidate community. Second, we use node attribute information to filter the candidate communities against an information entropy threshold. Communities that pass the filter are called high-quality communities; they satisfy the condition that there is at least one attribute on which most members of the community share the same value. Finally, we merge similar high-quality communities.

We implemented SARIE and ran an experiment on a public Facebook data set, evaluating the quality of the output by comparing the detected communities against the ground-truth communities. The measure we use is the F1 score, defined as the harmonic mean of precision and recall. The best result reported so far on the same data set is 0.462; SARIE achieves a 41.99% improvement in F1 score over that, which shows that SARIE performs better. We also repeated the experiments to verify its stability. SARIE is simple and easy to parallelize, so it can be applied to large-scale social networks. Moreover, SARIE supports overlapping community detection and nodes with multi-valued attributes. The algorithm not only finds community structures but also returns a label for each community at the same time, which is convenient when the output is used in applications.
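
The three steps described above can be illustrated with a minimal Python sketch. This is not the thesis implementation. The abstract does not give the walk length, the number of walks per node, the entropy threshold value, or the merge criterion, so the parameters `steps`, `walks_per_node`, `threshold`, and `min_jaccard`, the Jaccard-based merge rule, and all function names are assumptions chosen for illustration. The sketch also assumes single-valued attributes and that every node carries the same attribute names.

```python
import math
import random
from collections import Counter


def random_walk_track(adj, start, steps, rng):
    """Step 1: one fixed-length random walk; the visited set is a candidate."""
    track, node = {start}, start
    for _ in range(steps):
        neighbors = adj[node]
        if not neighbors:  # dead end: stop the walk early
            break
        node = rng.choice(neighbors)
        track.add(node)
    return track


def attribute_entropy(values):
    """Shannon entropy, in bits, of the value distribution of one attribute."""
    counts = Counter(values)
    total = len(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def best_label(community, attrs, threshold):
    """Step 2: return (attribute, dominant value) for the lowest-entropy
    attribute, or None if no attribute is at or below the threshold.
    Assumes every node has the same attribute names (hypothetical layout)."""
    best = None
    for name in sorted(attrs[next(iter(community))]):
        values = [attrs[node][name] for node in community]
        h = attribute_entropy(values)
        if h <= threshold and (best is None or h < best[0]):
            best = (h, name, Counter(values).most_common(1)[0][0])
    return None if best is None else (best[1], best[2])


def merge_similar(communities, min_jaccard):
    """Step 3 (assumed rule): greedily merge communities whose Jaccard
    similarity with an already kept community reaches min_jaccard."""
    merged = []
    for community in communities:
        for i, existing in enumerate(merged):
            if len(community & existing) / len(community | existing) >= min_jaccard:
                merged[i] = existing | community
                break
        else:
            merged.append(set(community))
    return merged


def detect(adj, attrs, walks_per_node=5, steps=4, threshold=0.9,
           min_jaccard=0.5, seed=0):
    """Run the three steps; all default values are illustrative assumptions."""
    rng = random.Random(seed)
    candidates = [random_walk_track(adj, node, steps, rng)
                  for node in sorted(adj) for _ in range(walks_per_node)]
    kept = [c for c in candidates
            if len(c) > 1 and best_label(c, attrs, threshold) is not None]
    # The label is recomputed after merging and may be None if the merged
    # community no longer passes the entropy filter.
    return [(c, best_label(c, attrs, threshold))
            for c in merge_similar(kept, min_jaccard)]
```

Low entropy on an attribute means its values are concentrated on one value, which matches the condition that most members share the same value on at least one attribute. Because each walk is independent of the others, the candidate-generation loop in `detect` is the part that would parallelize most naturally, and because communities are plain sets, a node may appear in more than one of them.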
Keywords/Search Tags:Social Networks, Node Attributes, Community Detection, Information Entropy Theory, Random Walk Model