The Application Research On IB Method In Spam Filtering

Posted on:2012-04-30

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Wang

Full Text:PDF

GTID:2218330338457332

Subject:Computer software and theory

Abstract/Summary:

The Information Bottleneck (IB) method is a data analysis method grounded in information theory that treats data analysis as a data compression process. Given the joint probability distribution of a random variable X and an observed relevant variable Y, the IB method seeks the best tradeoff between accuracy and compression when clustering X, and thereby uncovers the underlying structure of the data. The IB method has been applied in many fields for feature extraction and information compression, with strong reported performance.

Spam is a serious network problem that disrupts people's daily life and work. Because existing spam filtering algorithms do not account for class imbalance, they cannot precisely extract the features of the spam class, which is treated as the rare class. Existing algorithms typically achieve high precision by sacrificing recall, or vice versa. Building a model that achieves both high precision and high recall is a key challenge for spam filtering.

To address the problem that recall and precision cannot be improved simultaneously, this thesis proposes a one-class spam filtering algorithm based on the IB principle. The algorithm treats spam filtering as a one-class problem: it first constructs a co-occurrence matrix from the spam class as data preprocessing, and then uses an extended IB algorithm to obtain the center of the spam class and the filtering rules. The experiments in this thesis show that the algorithm achieves better recall, precision, and accuracy than the other algorithms compared, and that recall and precision improve simultaneously. The work extends the application of the IB principle and offers a new research direction for spam filtering.
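
The compression/accuracy tradeoff described in the abstract is commonly written as minimizing I(X;T) - beta * I(T;Y), where T is the cluster variable. The sketch below evaluates that quantity for a hard clustering of a toy word/class joint distribution. It is only an illustration of the standard IB objective, not the extended IB algorithm of the thesis; the function names, the toy probabilities, and the value of beta are all assumptions made for this example.

```python
import math
from collections import defaultdict


def mutual_information(joint):
    """Return I(A;B) in bits for a joint distribution given as {(a, b): p}."""
    pa = defaultdict(float)
    pb = defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(
        p * math.log2(p / (pa[a] * pb[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


def ib_objective(joint_xy, assignment, beta):
    """Evaluate I(X;T) - beta * I(T;Y) for a hard clustering x -> t.

    Returns (objective, compression term I(X;T), relevance term I(T;Y)).
    Lower objective values are better.
    """
    joint_xt = defaultdict(float)
    joint_ty = defaultdict(float)
    for (x, y), p in joint_xy.items():
        t = assignment[x]
        joint_xt[(x, t)] += p
        joint_ty[(t, y)] += p
    compression = mutual_information(joint_xt)
    relevance = mutual_information(joint_ty)
    return compression - beta * relevance, compression, relevance


# Hypothetical toy data: X = word, Y = message class.
joint = {
    ("free", "spam"): 0.20, ("free", "ham"): 0.05,
    ("winner", "spam"): 0.20, ("winner", "ham"): 0.05,
    ("meeting", "spam"): 0.05, ("meeting", "ham"): 0.20,
    ("report", "spam"): 0.05, ("report", "ham"): 0.20,
}

# One clustering groups words by class behaviour, the other mixes them.
grouped = {"free": 0, "winner": 0, "meeting": 1, "report": 1}
mixed = {"free": 0, "meeting": 0, "winner": 1, "report": 1}

beta = 5.0  # assumed tradeoff weight
grouped_obj, grouped_comp, grouped_rel = ib_objective(joint, grouped, beta)
mixed_obj, mixed_comp, mixed_rel = ib_objective(joint, mixed, beta)
```

Both clusterings compress X equally (two equally likely clusters), but only the grouped one keeps information about Y, so it scores a lower objective. A larger beta puts more weight on preserving information about Y, a smaller one on compression.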

Keywords/Search Tags:

IB method, one-class problem, spam filtering

Related items

1	Spam Filtering Method And System Realization
2	Research And Implementation Of A Three-Dimensional Hybrid Spam Filtering Method
3	Research On Spam-filtering Method Based On Visual Features Analysis
4	The Research Of The Spam Filtering Method Based On The Behavior Identifying
5	A Spam Hybrid Filtering Technology Research
6	Research On Spam Filtering Method Based On Social Networks
7	Research And Implementation Of Spam Pages Filtering Based On Bayesian And Decision Tree Algorithms
8	The Design And Implementation Of Image Spam Filtering System Based On Cascade Method
9	SVM-Based Novel Method Of Online Spam Filtering
10	On Technology Of Image-Based Spam Filtering