Real time text analysis on Internet Relay Chat conversations

Posted on: 2013-06-26
Degree: M.S.
Type: Thesis
University: Purdue University
Candidate: Michels, Marvin O
GTID: 2458390008481139
Subject: Information Technology
Abstract/Summary:
Internet Relay Chat (IRC) has been, and still is, used for a range of legal and illegal activities. Investigations involving IRC tend to be arduous and require a vast number of man-hours for constant monitoring, whether by law enforcement or by an ordinary user browsing the channels. This research developed the IRC Data Gathering Tool (IRCDGT), which provides real-time analysis of IRC chat messages as well as real-time updates to the investigator. The tool is intended to reduce the number of man-hours an investigator must spend in front of a computer. A crawler was developed for IRC that works through a list of channels and reports on what is being discussed in each. Basic keyword analysis statistically outperformed keyword & POST analysis in terms of recall, while there was no significant difference between the two in terms of precision. Topic analysis was performed in near-real time to enhance the keyword analysis. Lastly, natural language processing appears to have difficulty with the language of the Internet subculture.
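As an illustration only, the keyword analysis and its precision/recall evaluation might be sketched as follows. This is not the IRCDGT implementation: the keyword list, the function names, and the hand-labelled sample are all hypothetical, and the parser assumes raw channel messages in the standard ":nick!user@host PRIVMSG #channel :text" form.

```python
import re

# Hypothetical watch list; the keywords actually used in the thesis are not given here.
KEYWORDS = {"warez", "exploit"}

# Matches a raw IRC channel message: ":nick!user@host PRIVMSG #channel :text"
PRIVMSG_RE = re.compile(
    r"^:(?P<nick>[^!\s]+)\S*\s+PRIVMSG\s+(?P<channel>#\S+)\s+:(?P<text>.*)$"
)


def parse_privmsg(line):
    """Return (nick, channel, text) for a channel message, or None otherwise."""
    m = PRIVMSG_RE.match(line.strip())
    return (m["nick"], m["channel"], m["text"]) if m else None


def is_flagged(text, keywords=KEYWORDS):
    """Flag a message if any lower-cased token is on the watch list."""
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    return bool(tokens & keywords)


def precision_recall(predicted, actual):
    """Compute precision and recall from parallel lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented sample lines with invented relevance labels, for demonstration.
    sample = [
        (":alice!a@host PRIVMSG #chan :new exploit posted", True),
        (":bob!b@host PRIVMSG #chan :hello everyone", False),
        (":carol!c@host PRIVMSG #chan :selling 0day tonight", True),
    ]
    predicted = []
    for line, _label in sample:
        parsed = parse_privmsg(line)
        predicted.append(parsed is not None and is_flagged(parsed[2]))
    labels = [label for _line, label in sample]
    print(precision_recall(predicted, labels))
```

The third sample line shows how slang that is absent from the watch list lowers recall without affecting precision, the kind of trade-off the evaluation above measures.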
Keywords/Search Tags: IRC, Chat, Keyword analysis