| With the development of the Internet, E-mail has gradually won favor for its convenience, and many important letters are now conveyed by E-mail. Unfortunately, the large volume of Spam spreading over the network at the same time not only fills up mail server storage but also forces users to spend considerable time removing it. It is therefore important to explore automated E-mail filtering. There are two major approaches to automated mail filtering: rule-based and probability-based. Rule-based filters are often confined to a two-dimensional space, carry no measure of credibility, and require users to set up and alter the filtering rules by hand, so they perform poorly. At present, most probability-based filters use the Naïve Bayes algorithm (NB) or K-Nearest Neighbor (KNN), both of which are based on Empirical Risk Minimization, so their generalization performance is not very good. Support Vector Machines (SVM) are a novel machine learning method. They handle small-sample learning problems better by using Structural Risk Minimization in place of Empirical Risk Minimization. Moreover, by means of kernel functions, they transform a problem that is non-linear in the original space into a linear one, which reduces algorithmic complexity. SVM have become a hotspot of machine learning because of their excellent learning performance. In this article, SVM are applied to E-mail filtering, and the experimental results show improved filtering performance. Firstly, this article analyses the current situation and harm of Spam. It introduces anti-Spam organizations and general background knowledge, and discusses the various existing filtering technologies in depth. Secondly, the feature representation of mail and the theory of SVM are introduced. The article then discusses the SVM-based E-mail filtering algorithm and compares SVM with NB and KNN by experiment, which shows that SVM is clearly superior to the other two.
Finally, this article designs and tentatively implements an E-mail filtering system based on a support vector machine. This system, which resides on the mail client, can learn from the mail sample... |
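
The abstract's remark that kernel functions turn a non-linear problem into a linear one can be illustrated with a minimal sketch. The degree-2 polynomial kernel, the function names, and the two sample points below are assumptions chosen for illustration; they are not the kernel or data used in this work. The sketch shows that evaluating the kernel in the original 2-D space gives the same value as an ordinary inner product in an explicitly constructed 3-D feature space, so a linear method in the feature space never needs to build that space explicitly.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: K(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def phi(x):
    """Explicit feature map for 2-D inputs that corresponds to poly2_kernel.

    phi(x) = (x1^2, sqrt(2) * x1 * x2, x2^2), so that
    dot(phi(x), phi(z)) == (x . z)^2 up to rounding.
    """
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Hypothetical sample points, for illustration only.
x = (1.0, 2.0)
z = (3.0, -1.0)

kernel_value = poly2_kernel(x, z)        # computed in the original 2-D space
explicit_value = dot(phi(x), phi(z))     # computed in the 3-D feature space

print(kernel_value, explicit_value)
```

The two printed values agree up to floating-point rounding. The kernel evaluation costs one 2-D inner product and a squaring, whereas the explicit route has to construct the higher-dimensional vectors first, which is the saving the kernel idea provides.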