
Research On Short Text Classification Technology And Its Scene Application

Posted on: 2018-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Feng
GTID: 2359330512474194
Subject: Master of Applied Statistics
Abstract/Summary:
Consumers are a central part of commodity trading, and consumer data is critical for merchants. It supports merchants' business decisions, such as broadening product categories and improving service quality, so mining value from massive data is of great practical significance in the information age. A consumer's Commodity Description is a Chinese short text of fewer than 30 words, and research on Chinese short text classification has not yet matured. The study of short text classification technology is therefore the primary task of this paper.

First, this paper reviews domestic and international research on text categorization and describes the whole text categorization process. Then, because transaction data has sparse features, diverse categories, and highly unbalanced samples, the transaction text is first classified by rules: texts that meet a rule's criteria are assigned directly to the corresponding category, and the remaining texts are classified with machine learning methods. In the machine learning stage, lasso, a feature selection method from regression modeling, is applied to feature selection for the classification model, and an SVM (support vector machine) is used as the classifier. The results of "Lasso + SVM" are then compared with three commonly used methods. The results show that combining rule classification with machine learning classification achieves high accuracy, recall, and F1 values, and that the improved "Lasso + SVM" method classifies better than the commonly used methods.

Next, we study two extended application scenarios for transaction text classification.

Scenario 1: Smart recommendation. Using the improved text classification method, the texts of a user's Commodity Descriptions over a period of time are classified as prediction samples to obtain the user's transaction categories. These categories are then combined with other data to construct a user portrait of the buyer, and the differing characteristics of these portraits are used to predict the consumer's next behavior and so to recommend goods and services intelligently. This can help businesses or sellers improve marketing efficiency and reduce operating costs.

Scenario 2: P2P platform risk management. The user's transfer data is selected from the transaction types above. Transfer data is used to build a transfer network, which indicates whether the customer has economic ties with others; call data is used to build a relationship network, which indicates whether the customer has contact with others in daily life. This paper proposes combining the two relational networks into a risk control relationship circle, to explore potential customers in the future and to provide decision support for loan risk management on P2P platforms.

The innovation of this paper has two aspects. First, based on the characteristics of the text in consumers' Commodity Descriptions, this paper classifies the text by combining rule classification with machine learning classification, and it adopts "Lasso + SVM" in the machine learning stage, which offers a new method for text categorization. Second, transfer data is drawn from consumer transaction data (including the name of the transaction and the parties to the transaction) and combined with call data to construct a risk management relationship circle, providing a new idea for the risk management of P2P platforms.
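The relationship circle in Scenario 2 can be illustrated with a small sketch. The abstract does not say how the transfer network and the call network are combined, so the version below rests on an assumption: it gathers a customer's direct contacts from both networks and labels each contact with the kind of tie it has. The function names (`build_network`, `relationship_circle`), the tie labels, and the sample data are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected network (adjacency sets) from (person_a, person_b) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-links
            graph[a].add(b)
            graph[b].add(a)
    return graph


def relationship_circle(customer, transfer_pairs, call_pairs):
    """Label each direct contact of `customer` with its tie types.

    Assumption: a transfer link counts as an "economic" tie and a call link
    as a "social" tie; a contact found in both networks carries both labels.
    """
    transfers = build_network(transfer_pairs)
    calls = build_network(call_pairs)
    economic = transfers.get(customer, set())
    social = calls.get(customer, set())

    circle = {}
    for person in sorted(economic | social):
        ties = []
        if person in economic:
            ties.append("economic")
        if person in social:
            ties.append("social")
        circle[person] = ties
    return circle


# Hypothetical toy data: who transferred money to whom, and who called whom.
transfer_pairs = [("alice", "bob"), ("alice", "carol")]
call_pairs = [("alice", "bob"), ("dave", "alice")]

circle = relationship_circle("alice", transfer_pairs, call_pairs)
print(circle)
# {'bob': ['economic', 'social'], 'carol': ['economic'], 'dave': ['social']}
```

Under this assumption, contacts that carry both labels (here `bob`) are the most strongly connected to the customer. A real system might instead weight ties by transfer amount or call frequency, or follow links beyond direct contacts.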
Keywords/Search Tags: text classification, user portraits, intelligent recommendation, loan risk management