
Pyramid Selling Recognition Based On Text Classification Using SVM

Posted on: 2020-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: S X Su
Full Text: PDF
GTID: 2416330623456285
Subject: Software engineering
Abstract/Summary:
Internet pyramid selling is a new type of pyramid scheme that operates online and is characterized by strong concealment, strong deceptiveness, wide coverage, and fast dissemination. Such schemes disrupt order on the Internet, hinder economic development, and cause a crisis of credibility, seriously affecting social harmony and stability. Online pyramid schemes use many criminal methods, most often "propaganda and promotion" on the Internet, and text is one of the most widely used forms of information in these activities. Data mining technology is widely used in online public opinion analysis but has seen few applications in the task of recognizing pyramid selling organizations. Because the propaganda articles of pyramid selling organizations differ from those of normal companies, this thesis proposes to identify pyramid selling companies or groups through text classification.

Text classification based on the support vector machine (SVM) is a mainstream text classification scheme. In this thesis, recognizing pyramid selling organizations is treated as essentially a text classification task. The thesis analyzes and summarizes the characteristics of pyramid selling propaganda articles and uses those characteristics to improve the SVM-based text classification scheme. It also studies improvements to the SVM algorithm itself for this recognition task. The main research work is as follows:

(1) A feature weighting algorithm that uses information about the distribution of features. Traditional feature weighting algorithms ignore category information; the proposed algorithm improves on them by using the distribution of features between categories. By giving higher weight to features with stronger class-distinguishing ability, it improves the accuracy of pyramid selling organization recognition.

(2) A new text representation model for binary classification problems, the topic vector space model. Building on this model and on the characteristics of pyramid selling propaganda articles, the thesis proposes a text representation model specific to this recognition task, the pyramid vector space model. The experiments show that the pyramid vector space model performs better than the traditional vector space model in recognizing pyramid selling organizations.

(3) An incremental learning scheme applied to the construction of SVM classifiers. A standard SVM classifier has to rebuild its model when new training samples are added; to address this shortcoming, the thesis introduces an incremental learning scheme for SVM. Incremental learning can greatly reduce the amount of computation when new training samples are added, and it improves the performance of pyramid selling organization recognition.
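The idea behind the first contribution, giving higher weight to features whose distribution differs between categories, can be illustrated with a small sketch. The abstract does not state the actual formula, so the weighting below is an assumption made only for illustration: a smoothed IDF multiplied by a hypothetical class-distinguishing factor, 1 + |p_pos - p_neg|, where p_pos and p_neg are the fractions of pyramid-selling and normal documents that contain the term. The function names `class_aware_weights` and `vectorize` and the toy documents are likewise hypothetical.

```python
import math
from collections import Counter


def class_aware_weights(docs, labels):
    """Return one weight per term from tokenized docs and 0/1 labels.

    Assumed formula (not taken from the thesis):
        weight = smoothed IDF * (1 + |p_pos - p_neg|)
    where label 1 marks a pyramid-selling document.
    """
    n_docs = len(docs)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n_docs - n_pos

    df = Counter()      # document frequency over all documents
    df_pos = Counter()  # document frequency within the positive class
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df[term] += 1
            if y == 1:
                df_pos[term] += 1

    weights = {}
    for term, d in df.items():
        idf = math.log((1 + n_docs) / (1 + d)) + 1.0
        p_pos = df_pos[term] / n_pos if n_pos else 0.0
        p_neg = (d - df_pos[term]) / n_neg if n_neg else 0.0
        # Terms spread evenly across both classes get factor 1;
        # terms confined to one class get a factor of up to 2.
        weights[term] = idf * (1.0 + abs(p_pos - p_neg))
    return weights


def vectorize(tokens, weights):
    """Map a tokenized document to a sparse {term: weight} vector."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: (c / total) * weights[t] for t, c in counts.items() if t in weights}


# Toy corpus (invented): label 1 = pyramid-selling text, 0 = normal company text.
docs = [
    ["join", "recruit", "rebate"],
    ["recruit", "rebate", "profit"],
    ["product", "profit", "report"],
    ["product", "service", "report"],
]
labels = [1, 1, 0, 0]
weights = class_aware_weights(docs, labels)
```

In this toy corpus, "recruit" and "profit" each occur in two documents and so share the same IDF, but "recruit" occurs only in the pyramid-selling class and therefore receives a larger weight than "profit", which is split evenly between the classes. The resulting sparse vectors could then be passed to any SVM implementation.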
Keywords/Search Tags:pyramid selling, text classification, feature weighting, text representation model, incremental learning