| E-mail is low-cost and fast, and it is now widely used. At the same time, attackers exploit these same advantages to flood people's mailboxes with large amounts of spam. Researchers have developed many spam detection and filtering methods, and as filtering technology has improved, spammers have been driven to explore new ways of producing spam. As a result, image spam has become today's dominant medium for junk information. According to McAfee's 2007 report, image spam accounts for about 30% of all spam. Image spam is advertising text embedded in a picture and deliberately sent to e-mail clients, either as an attachment or directly as message content. This paper systematically analyzes the background, current state of development, and research significance of image spam filtering, and studies its key technologies in depth. Building on existing research results, its main contribution is to train a support vector machine with an online learning algorithm to obtain a highly accurate and stable classifier. Labeled samples take considerable manpower and resources to obtain, whereas unlabeled samples are relatively easy to acquire. We therefore extract informative sample points from the unlabeled samples, add them to the training set, and retrain the support vector machine on the continually updated training set until the classification accuracy stabilizes, yielding highly accurate classification results. |
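The training loop described above can be sketched roughly as follows. This is only an illustrative sketch under several assumptions that the abstract does not state: "informative" is taken to mean "closest to the current SVM hyperplane" (margin-based uncertainty sampling), the SVM is linear and trained with a Pegasos-style stochastic subgradient method, stabilization is judged on a small labeled validation set, and two synthetic 2-D Gaussian blobs stand in for real image-spam feature vectors. All names (`train_linear_svm`, `active_learning`, `make_blobs`, `oracle`) are hypothetical.

```python
import random

def decision(w, x):
    """Signed score w.x + b; the bias b is stored as the last entry of w."""
    return sum(wj * xj for wj, xj in zip(w, x)) + w[-1]

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            xi = list(X[i]) + [1.0]                    # constant feature for the bias
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, xi))
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularization shrink
            if margin < 1.0:                           # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, xi)]
    return w

def accuracy(w, X, y):
    """Fraction of points whose predicted sign matches the label."""
    hits = sum((1 if decision(w, x) >= 0 else -1) == t for x, t in zip(X, y))
    return hits / len(X)

def active_learning(pool, oracle, val_X, val_y, n_init=4, batch=2,
                    max_rounds=20, tol=0.005, patience=3):
    """Grow the training set with the most uncertain pool points each round.

    `oracle(x)` stands in for a human annotator and returns +1 or -1.
    Stops once validation accuracy changes by less than `tol` for
    `patience` consecutive rounds (an assumed stabilization criterion).
    """
    rng = random.Random(0)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    train_X = [unlabeled.pop() for _ in range(n_init)]
    train_y = [oracle(x) for x in train_X]
    history, stable = [], 0
    for _ in range(max_rounds):
        w = train_linear_svm(train_X, train_y)
        history.append(accuracy(w, val_X, val_y))
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            stable += 1
        else:
            stable = 0
        if stable >= patience or not unlabeled:
            break
        # Assumed informativeness measure: smallest distance to the hyperplane.
        unlabeled.sort(key=lambda x: abs(decision(w, x)))
        queried, unlabeled = unlabeled[:batch], unlabeled[batch:]
        train_X += queried
        train_y += [oracle(x) for x in queried]
    return w, history

def make_blobs(n, seed):
    """Synthetic stand-in data: two Gaussian blobs, mapped point -> label."""
    rng = random.Random(seed)
    data = {}
    for k in range(n):
        label = 1 if k % 2 == 0 else -1
        point = (rng.gauss(3.0 * label, 1.0), rng.gauss(3.0 * label, 1.0))
        data[point] = label
    return data

if __name__ == "__main__":
    pool = make_blobs(200, seed=1)
    val = make_blobs(100, seed=2)
    w, history = active_learning(list(pool), pool.get, list(val), list(val.values()))
    print("validation accuracy per round:", history)
```

In a real filter, the pool would hold feature vectors extracted from unlabeled e-mail images and `oracle` would be a manual labeling step. The point of the design is that only the few points nearest the decision boundary are sent for labeling each round, which keeps the labeling cost low.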