
Application of the Bayesian Algorithm in Spam Filtering

Posted on: 2013-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2248330371464963
Subject: Computer technology
Abstract/Summary:
E-mail is one of the most popular network applications and has become an important means of network communication. However, e-mail is being abused to send spam, which causes great harm on the Internet. Initially, spam consisted mainly of unsolicited commercial advertising; it now increasingly carries sexual and political content. Spam has reached as much as 40% of all e-mail, and the share is still growing. Spam has also become a new and fast channel for spreading computer viruses.

Spam affects the majority of Internet users. The impact is not limited to the time people spend dealing with it and the system resources it occupies; it also raises many security issues. Spam clearly consumes a large amount of network resources. Some poorly secured mail servers have been used as spam relays and have consequently received warnings or had their IP addresses blocked; with so many network resources consumed, normal business operations slow down. As international anti-spam efforts develop and organizations share blacklists, innocent servers are blocked on a wider scale, which causes serious problems for legitimate users. Spam is also increasingly combined with hacker attacks and viruses, and the amount of spam carrying malicious code or monitoring software has increased significantly. E-mail viruses have become more deceptive and many companies have suffered; ordinary users find it hard to judge such messages correctly, yet the losses they incur are direct.

This thesis describes several data mining methods, compares different classification methods comprehensively, considers the factors that affect spam filtering, and builds a filtering model for message text based on a Bayesian classifier.
The Bayesian classification experiments use cross-validation: words and phrases are collected from a sample of message texts, the classifier is trained, and the classification of the messages in the test set is then determined, which yielded relatively good experimental results. In addition, with the data mining tool Weka, the Bayesian classifier can be adjusted dynamically and repeatedly to achieve the best classification results.
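
As an illustration only, the general idea of a naive Bayes text classifier evaluated with k-fold cross-validation can be sketched in plain Python. This is not the thesis's implementation (the experiments there used Weka); the function names, the whitespace tokenizer, the add-one smoothing, and the toy messages in `DATA` are all assumptions made for this sketch.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")

def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for real preprocessing.
    return text.lower().split()

def train(messages):
    """Count words per class. `messages` is a list of (text, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    class_counts = Counter()
    for text, label in messages:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def classify(model, text):
    """Return the label with the highest log posterior under naive Bayes."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        # Smoothed log prior, so a class missing from a fold never gives log(0).
        score = math.log((class_counts[label] + 1) / (total + len(LABELS)))
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothed word likelihood.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_validate(messages, k):
    """k-fold cross-validation; returns the overall fraction classified correctly."""
    correct = 0
    for i in range(k):
        test = messages[i::k]  # every k-th message forms fold i
        training = [m for j, m in enumerate(messages) if j % k != i]
        model = train(training)
        correct += sum(classify(model, text) == label for text, label in test)
    return correct / len(messages)

# Hypothetical toy corpus, invented for this sketch.
DATA = [
    ("cheap pills buy now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("win money now click here", "spam"),
    ("project report attached", "ham"),
    ("buy cheap watches now", "spam"),
    ("lunch on monday", "ham"),
]

if __name__ == "__main__":
    model = train(DATA)
    print(classify(model, "buy cheap pills"))
    print("3-fold accuracy:", cross_validate(DATA, 3))
```

On a corpus this small the cross-validation accuracy carries no meaning; the sketch only shows how training, classification, and fold-wise evaluation fit together.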
Keywords/Search Tags: Spam filtering, Naive Bayes, Text classification