| The information bottleneck (IB) method is a data analysis method grounded in information theory that treats data analysis as a data compression process. Given the joint probability distribution of a random variable X and an observed relevant variable Y, the IB method finds the best tradeoff between accuracy and compression when clustering X, and in doing so it effectively reveals the internal structure of the data model. The IB method has been applied in many fields to extract features and compress information, with outstanding performance. Spam is a serious network problem that disrupts people's daily life and work. Because existing spam filtering algorithms do not address the problem of describing imbalanced classes, they cannot precisely extract the features of the spam class, which is treated as the rare class. When filtering spam, existing algorithms typically achieve high precision by sacrificing recall, and vice versa. Building a model that maximizes both precision and recall is a key challenge for spam filtering algorithms. To address the problem that the recall and precision of spam filtering cannot be improved simultaneously, this paper proposes a one-class spam filtering algorithm based on the IB principle. The algorithm treats spam filtering as a one-class problem: it first constructs a co-occurrence matrix from the spam class as data preprocessing, and then uses an extended IB algorithm to obtain the center of the spam class and the filtering rules. The experiments in this paper show that the algorithm achieves better recall, precision, and accuracy than the other algorithms compared, and that recall and precision improve simultaneously. With this algorithm, the paper extends the application field of the IB principle and offers a new research direction for spam filtering. |
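
The preprocessing and compression ideas described above can be illustrated with a minimal Python sketch. It is not the paper's extended IB algorithm, which the abstract does not specify. It assumes X is a set of spam messages and Y is the set of words they contain, builds a co-occurrence joint distribution p(x, y) from a toy, made-up corpus, and compares the mutual information I(X;Y) before clustering with I(T;Y) after a hand-picked hard clustering T of the messages. The function names, the toy messages, and the cluster assignment are all hypothetical.

```python
import math
from collections import Counter


def cooccurrence_joint(docs):
    """Build p(x, y) from message/word co-occurrence counts.

    Keys are (message_index, word); values are normalized counts.
    """
    counts = Counter()
    for i, doc in enumerate(docs):
        for word in doc.lower().split():
            counts[(i, word)] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def mutual_information(joint):
    """Mutual information in bits of a joint distribution {(a, b): p}."""
    pa, pb = Counter(), Counter()
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(
        p * math.log2(p / (pa[a] * pb[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


def merge_clusters(joint, assignment):
    """Compress X into clusters T: returns p(t, y) for a hard assignment x -> t."""
    merged = Counter()
    for (x, y), p in joint.items():
        merged[(assignment[x], y)] += p
    return dict(merged)


# Toy spam corpus (hypothetical data, for illustration only).
spam_docs = [
    "win cash now",
    "cash prize win",
    "cheap pills online",
    "online pills cheap",
]

joint = cooccurrence_joint(spam_docs)
i_xy = mutual_information(joint)

# Hand-picked clustering: messages 0 and 1 in cluster "a", 2 and 3 in "b".
assignment = {0: "a", 1: "a", 2: "b", 3: "b"}
i_ty = mutual_information(merge_clusters(joint, assignment))

print(f"I(X;Y) = {i_xy:.4f} bits, I(T;Y) = {i_ty:.4f} bits")
```

In this toy case, merging four messages into two clusters keeps 1 bit of the roughly 1.17 bits of information about the words, which is the kind of accuracy-versus-compression tradeoff the IB method optimizes. A full IB procedure would search over assignments, for example by minimizing I(X;T) - βI(T;Y) for a chosen β, rather than fixing the assignment by hand.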