
The Analysis, Based On The Incremental Classification Of Credit Card Customers

Posted on: 2007-12-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Xu
Full Text: PDF
GTID: 2209360185469251
Subject: Computer application technology
Abstract/Summary:
Data mining has developed rapidly in recent years, and its usefulness has grown as the volume of available information has exploded. It has been applied extensively in many domains, such as finance, retail and medical treatment.

In this paper, we apply data mining technology to classify credit card customers according to their savings product. With the resulting model, we can analyze the characteristics of each customer and their transactions, and thus predict the value of the customer. We use SLIQ, a scalable, efficient and highly accurate classifier, to build decision trees, and AdaBoost to achieve higher accuracy.

To further improve performance, this paper makes two improvements to SLIQ. First, we use a new splitting index instead of the gini index to evaluate the "goodness" of the alternative splits for each attribute. Second, we treat categorical attributes with only two possible values as numeric attributes when evaluating splits. In this way, we obtain a more accurate, less expensive and smaller model. On the testing dataset, the accuracy of the model reaches about 90%.

We also put forward an incremental learning algorithm for this model. When the number of new samples reaches or exceeds a threshold, we use these samples to build a model and merge it with the current model to obtain a new current model. Merging the models involves merging the decision trees that have the same tree number and calculating a new weight for each merged tree. Merging decision trees raises two problems. The first is which label to assign to an intersection when two overlapping leaves, one from each tree, have different labels. The second is that the new tree can become too big, with too many leaves. This paper gives solutions to these two problems:
(1) Making use of the pre-sorted attribute lists while deciding the labels of intersections.
(2) Using pruning strategy in uniting neighborhood...
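
The abstract refers to evaluating candidate splits over pre-sorted attribute lists and names the gini index as the baseline that the thesis replaces. The thesis's own splitting index is not given here, so the sketch below shows only the standard gini baseline, scanned over a sorted attribute list in the general style of SLIQ. The function names `gini` and `best_numeric_split` are hypothetical and are not taken from the thesis. A two-valued categorical attribute encoded as 0/1 can be passed through the same routine, which is one plausible reading of treating such attributes as numeric.

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity of a class histogram: 1 - sum(p_c ** 2)."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_numeric_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'.

    Illustrative baseline only: this uses the standard gini index, not the
    new splitting index proposed in the thesis (which the abstract does not
    specify).
    """
    # Sort once by attribute value, mimicking a pre-sorted attribute list.
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    left = Counter()
    right = Counter(label for _, label in pairs)
    best = (None, float("inf"))
    for i in range(n - 1):
        value, label = pairs[i]
        # Move one record from the right partition to the left partition.
        left[label] += 1
        right[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # a threshold cannot separate equal values
        n_left = i + 1
        n_right = n - n_left
        score = (n_left * gini(left, n_left) + n_right * gini(right, n_right)) / n
        if score < best[1]:
            best = ((value + next_value) / 2, score)
    return best
```

For a two-valued attribute encoded as 0 and 1, the scan has a single candidate threshold (0.5), so evaluating it costs one pass with no subset enumeration.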
Keywords/Search Tags:credit card customer, SLIQ, AdaBoost, incremental learning, merge