
Research On The Loss Of Customers Based On Logistic Regression And SVM

Posted on: 2019-05-22  Degree: Master  Type: Thesis
Country: China  Candidate: F C He  Full Text: PDF
GTID: 2429330563458865  Subject: Applied statistics
Abstract/Summary:
Classification is a common problem in practice. Logistic regression and the support vector machine (SVM) are classic methods widely used for classification. Logistic regression is a generalized linear method: it models the nonlinear relationship between the explanatory variables and the response, produces a probability, and classifies on that basis. Its principle is simple and easy to interpret, and it is suitable for large-scale categorical data. The support vector machine classifies by solving an optimization problem that maximizes the margin, with a kernel function introduced to work in a high-dimensional space. Its mathematical theory is elegant, and it performs better on small-sample, high-dimensional, and nonlinear data. Compared with logistic regression, however, the model is less interpretable and the theory is more complex. Since both methods have advantages and disadvantages, this thesis applies the two together to the classification problem of customer loss at Unicom. First, it introduces the theory of logistic regression, including likelihood estimation of the regression coefficients, Mallows quasi-likelihood estimation, and consistent error estimation. It then introduces the principle of the support vector machine and the derivation of its algorithm. Finally, it uses logistic regression and support vector classifiers to model the Unicom customer-loss data. In building the logistic regression model, outlier removal and robust estimation were applied to the data to resist the influence of outliers on the maximum likelihood estimate, and the regression coefficients fitted under the different treatments were compared. The comparison showed that when the proportion of outliers is small, the maximum likelihood and robust estimates of the logistic regression are almost the same. Overall, logistic regression and the SVM gave similar classification results, but the SVM classification model was generally better than logistic regression. In particular, on imbalanced data in which one class makes up a smaller proportion, the prediction accuracy of the SVM classification model was significantly better than that of logistic regression.
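The comparison described in the abstract can be illustrated with a minimal sketch. It does not use the thesis data or reproduce the thesis procedure: the data are synthetic (generated with scikit-learn's `make_classification`), the function name `compare_models`, the RBF kernel, the hyperparameters, and the use of balanced accuracy as the metric are all assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(minority_share=0.1, seed=0):
    """Fit logistic regression and an RBF-kernel SVM on synthetic imbalanced data.

    `minority_share` is the (hypothetical) fraction of "lost" customers.
    Returns a dict mapping model name to balanced accuracy on a held-out split.
    """
    # Synthetic stand-in for a customer-loss data set (assumption, not thesis data).
    X, y = make_classification(
        n_samples=2000,
        n_features=10,
        n_informative=5,
        weights=[1.0 - minority_share, minority_share],
        random_state=seed,
    )
    # Stratify so that both splits keep the same class proportions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        # Maximum likelihood logistic regression (with scikit-learn's default L2 penalty).
        "logistic": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        # Margin-maximizing classifier with a kernel function.
        "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = balanced_accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    # Compare a mildly and a more strongly imbalanced setting.
    for share in (0.3, 0.1):
        print(share, compare_models(minority_share=share))
```

Balanced accuracy is used here because plain accuracy can look high on imbalanced data even when the minority class is mostly missed; the thesis itself reports prediction accuracy, so results from this sketch are not comparable to its findings.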
Keywords/Search Tags: Logistic Regression, Support Vector Machine, Classification Problem, Loss of customers