Real time text analysis on Internet Relay Chat conversations |
| Posted on:2013-06-26 | Degree:M.S | Type:Thesis |
| University:Purdue University | Candidate:Michels, Marvin O | Full Text:PDF |
| GTID:2458390008481139 | Subject:Information Technology |
| Abstract/Summary: |
| Internet Relay Chat (IRC) has been, and still is, used for a range of legal and illegal activities. Investigations involving IRC tend to be arduous and require many man-hours of constant monitoring, whether by law enforcement or by an ordinary user browsing the channels. This research developed the IRC Data Gathering Tool (IRCDGT), which supports real-time analysis of IRC chat messages and delivers real-time updates to the investigator. The tool is intended to reduce the number of man-hours an investigator must spend in front of a computer. A crawler was developed for IRC that goes through a list of channels and reports on what is being discussed in them. Basic keyword analysis statistically outperformed keyword & POST analysis in terms of recall, while there was no significant difference between the two in terms of precision. Topic analysis was performed in near-real time to enhance the keyword analysis. Lastly, natural language processing appears to have difficulty with the language of the Internet subculture. |
| Keywords/Search Tags: | IRC, Chat, Keyword analysis |
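The abstract compares analysis methods by precision and recall. As a rough illustration only, the sketch below flags chat messages by whole-word keyword matching and scores the result against a hand-labeled set. It is not the IRCDGT implementation: the keyword list, the sample messages, the relevance labels, and the function names `keyword_hits` and `precision_recall` are all hypothetical, chosen to show how the two metrics are computed.

```python
import re

def keyword_hits(message, keywords):
    """Return the keywords that occur as whole words in a message."""
    tokens = set(re.findall(r"[a-z0-9']+", message.lower()))
    return tokens & keywords

def precision_recall(flagged, relevant):
    """Compute precision and recall from two sets of message ids."""
    true_pos = len(flagged & relevant)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical keyword list and sample messages (not from the thesis).
KEYWORDS = {"exploit", "botnet"}
messages = {
    1: "anyone selling a botnet?",
    2: "lunch at noon",
    3: "new exploit dropped today",
    4: "I need stolen cards",
}
# Hypothetical ground truth: ids a human reviewer marked as relevant.
relevant = {1, 3, 4}

# Flag every message that contains at least one keyword.
flagged = {mid for mid, text in messages.items() if keyword_hits(text, KEYWORDS)}
precision, recall = precision_recall(flagged, relevant)
```

In this toy data, message 4 is relevant but contains no listed keyword, so it lowers recall without affecting precision. Adding a further filter on top of the keywords can only shrink the flagged set, which is one plausible reason a stricter method might lose recall.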