
The Research Of Email Classification Based On SVM

Posted on: 2008-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: H J Zhang
Full Text: PDF
GTID: 2178360215972127
Subject: Computer software and theory
Abstract/Summary:
E-mail is the most widely used Internet application and one of the most popular network functions. With the spread of information technology, it has evolved into a much more complex and rich system that allows the transmission of voice, pictures, images, multimedia files, and other information. Financial reports and even databases can be circulated over the Internet as e-mail attachments. E-mail has become the lifeblood of many businesses and organizations. Users can manage projects through e-mail discussions, and decisions are sometimes made through rapid, even intercontinental, exchanges of e-mail messages. However, as the volume of e-mail grows, how to classify e-mail effectively and filter out spam troubles many people.

The Support Vector Machine (SVM) is a new generation of learning machine based on statistical learning theory (SLT). It has many attractive features, and its functional capability, learning ability, and efficiency are considered superior to those of traditional artificial neural networks. Over the past decade, Vapnik and his colleagues proposed the SVM algorithm on the basis of statistical learning theory. It has distinct advantages in small-sample, nonlinear, and high-dimensional pattern recognition, and the method can also be applied to other machine learning problems. Many scholars believe that it is becoming a new hot topic in machine learning, following pattern recognition and neural networks, and that SVM will significantly advance machine learning theory and technology. SLT and SVM have made encouraging progress in applying kernels to the small-sample problem, and SLT is regarded as the best available theory for statistical estimation and prediction from small samples.

This thesis studies e-mail classification based on support vector machines and covers the following:

1. The theory of e-mail. The thesis explains the e-mail format, analyzes the principles of e-mail transmission and the relevant protocols and standards, and describes the current state of mail classification research in China. This lays the foundation for the work that follows.
2. Support Vector Machines. The thesis discusses the basic idea of SVM, its applications and research trends, its characteristics, and its use in classification.
3. The definition of text classification and its evaluation methods, followed by a detailed discussion of the Chinese text classification process: text representation, feature extraction, training methods, and classification.
4. The design and initial implementation of a mail classification system based on support vector machines. Running at the mail client, the system learns from mail samples on its own, automatically downloads mail from the mail server, and sorts and filters it.

This thesis has some shortcomings. Through more in-depth study of support vector machines, I hope to improve the classification algorithm. Through systematic research, the feature selection method best suited to spam features should be identified. Improving classification also requires collecting an extensive set of training mail samples. The research certainly has many flaws, and related work remains to be studied further.
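
The pipeline named in the abstract (text representation, feature extraction, training, classification) can be illustrated with a minimal sketch. It is not the thesis's implementation, which this summary does not describe. It assumes a whitespace tokenizer (real Chinese mail would need word segmentation), a plain bag-of-words representation, and a linear SVM without bias trained by Pegasos-style stochastic subgradient descent on the hinge loss. All function names, parameters, and sample messages are hypothetical.

```python
import random
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (assumption: a crude stand-in tokenizer)."""
    return text.lower().split()

def build_vocabulary(documents):
    """Map each distinct training token to a feature index."""
    vocab = {}
    for doc in documents:
        for token in tokenize(doc):
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Sparse bag-of-words vector {feature index: term count}; unknown tokens are dropped."""
    counts = Counter(tokenize(text))
    return {vocab[t]: float(c) for t, c in counts.items() if t in vocab}

def score(w, x):
    """Linear decision value w . x for a sparse vector x."""
    return sum(w[j] * v for j, v in x.items())

def train_linear_svm(samples, labels, dim, lam=0.01, epochs=50, seed=0):
    """Pegasos-style subgradient descent on the L2-regularized hinge loss.

    labels must be +1 (spam) or -1 (legitimate mail). No bias term is learned.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    order = list(range(len(samples)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                  # decaying step size
            x, y = samples[i], labels[i]
            violates_margin = y * score(w, x) < 1.0
            shrink = 1.0 - 1.0 / t                 # equals 1 - eta * lam
            w = [shrink * wj for wj in w]          # regularization step
            if violates_margin:                    # hinge-loss subgradient step
                for j, v in x.items():
                    w[j] += eta * y * v
    return w

def predict(text, vocab, w):
    """Return +1 for spam, -1 otherwise."""
    return 1 if score(w, vectorize(text, vocab)) > 0 else -1

# Hypothetical toy corpus, for illustration only.
train_texts = [
    "win free money now",
    "cheap pills free offer",
    "claim prize money now",
    "meeting agenda for monday",
    "project report attached",
    "lunch on monday with team",
]
train_labels = [1, 1, 1, -1, -1, -1]

vocab = build_vocabulary(train_texts)
train_vectors = [vectorize(t, vocab) for t in train_texts]
w = train_linear_svm(train_vectors, train_labels, dim=len(vocab))

for message in ["claim free prize money now today",
                "project report attached for the meeting"]:
    print(message, "->", "spam" if predict(message, vocab, w) == 1 else "ham")
```

A practical system would add the feature selection step the abstract mentions, a much larger training set, and probably a kernel SVM from an established library. The sketch only shows how the stages connect.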
Keywords/Search Tags: SVM, Email, Classification, Feature Selection