Hidden Group Identification Based On Communication Stream In Hidden Markov Model

Posted on: 2010-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: C S Sun
Full Text: PDF
GTID: 2120330332487759
Subject: Computer software and theory
Abstract/Summary:
Modern means of communication, such as e-mail, web logs, and chat rooms, allow individuals to communicate in many new ways and have produced a vast and growing body of communication data. This data provides an ideal environment for groups to hide their existence and activity: the myriad random background communications make such groups difficult to discover. Groups of this kind are called hidden groups. The tragic events of September 11, 2001 underline the need for a tool that can detect groups that hide their existence and functionality. In this thesis we first briefly review existing research and then introduce a new algorithm for detecting hidden groups that does not rely on the contents of messages and uses only the communication graph. By calculating the communication frequency and communication dates of the nodes, the algorithm constructs larger hidden groups by building them up from smaller ones in a stream model. We improve the algorithm using heuristic search and PageRank. Finally, the new algorithm is tested on both random graph data and the Enron Email Dataset; the results show that it runs faster while keeping cluster quality roughly the same and often better.
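The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea of content-free detection over a message stream, not the thesis's algorithm. It assumes that a pair of nodes is "persistent" when it communicates in at least `min_windows` distinct time windows, and it grows groups by merging persistent pairs with a union-find structure. The function name `persistent_groups`, the parameters `window` and `min_windows`, and the persistence rule itself are all hypothetical choices made for illustration; the heuristic search and PageRank refinements are not modelled.

```python
from collections import defaultdict


def persistent_groups(messages, window, min_windows):
    """Group nodes linked by recurring communication (illustrative only).

    messages: iterable of (timestamp, sender, receiver) tuples.
    Message contents are never inspected; only who talked to whom, and when.
    """
    # Record the set of time windows in which each unordered pair communicated.
    seen = defaultdict(set)
    for t, a, b in messages:
        if a != b:
            seen[frozenset((a, b))].add(int(t // window))

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Assumed rule: a pair seen in enough distinct windows is persistent,
    # and persistent pairs merge smaller groups into larger ones.
    for pair, windows in seen.items():
        if len(windows) >= min_windows:
            a, b = tuple(pair)
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Report only groups with more than one member, in a stable order.
    return sorted(
        (sorted(g) for g in groups.values() if len(g) > 1),
        key=lambda g: g[0],
    )
```

In this sketch, one-off exchanges play the role of random background communication and are discarded, while pairs that recur across windows are chained into a single group. The threshold `min_windows` trades recall against false positives and would need tuning on real data such as the Enron corpus.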
Keywords/Search Tags: Hidden Markov model, Hidden group, Communication stream, Enron email dataset