| Customers are among the most important resources of an enterprise: customer retention and satisfaction drive enterprise profits. With the development of the internet, market competition has intensified and customers have become increasingly diversified. To better identify customers, allocate limited enterprise resources, and improve core competitiveness, customer segmentation is essential; it is the key to successful customer retention. After years of development, the theory and methods of customer segmentation have steadily improved, and they have been applied to differentiated marketing in data-intensive industries such as banking, telecommunications, and retail. However, customer segmentation is less widely used in the logistics industry. Traditional methods of customer segmentation rely mainly on experience and statistical classification, which cannot cope with growing volumes of enterprise data and increasingly complex segmentation tasks. The emergence and development of data mining technology makes it possible to find new solutions for complex customer segmentation problems based on big data. Building on domestic and international research on customer segmentation and data mining, this thesis proposes a data-mining-based approach to customer segmentation. It uses a hybridized Rough-PCA approach for attribute reduction, followed by K-means for clustering. First, it introduces the theoretical basis of customer segmentation and data mining, together with a summary of the data mining methods applicable to customer segmentation. Then, based on the characteristics of the logistics industry and on customer segmentation criteria, it builds a customer segmentation evaluation index system for logistics enterprises. The index system is constructed along two dimensions, customer value and customer loyalty, and comprises 19 indexes in total.
Next, the hybridized Rough-PCA approach is applied to all the indexes obtained for attribute reduction, and K-means is used for clustering. The main advantage of this approach is that it produces a reduced set of attributes that captures the maximal variance in the data as well as the discriminative features best suited to classification, with minimal loss of information. Finally, the proposed customer segmentation model is verified with data from the HK logistics enterprise. The results show that the proposed model is more effective. |
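
The reduce-then-cluster pipeline described above can be illustrated with a minimal sketch. It is not the thesis's implementation: the rough-set stage of the Rough-PCA hybrid is omitted, and only a plain PCA projection followed by K-means is shown. All function names, the power-iteration eigen-solver, the farthest-point initialisation, and the sample data are assumptions chosen so the example runs with the Python standard library alone.

```python
import math

def standardize(rows):
    """Rescale every index (column) to zero mean and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # "or 1.0" guards against constant columns (zero standard deviation).
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def principal_axes(rows, n_components, iters=200):
    """Leading covariance eigenvectors via power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]
    axes = []
    for _ in range(n_components):
        # Arbitrary deterministic start vector, normalised to unit length.
        start_norm = math.sqrt(sum((j + 1) ** 2 for j in range(d)))
        v = [(j + 1) / start_norm for j in range(d)]
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to explain
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue; subtract it out (deflation).
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
        axes.append(v)
    return axes

def project(rows, axes):
    """Coordinates of each row along the retained principal axes."""
    return [[sum(a * x for a, x in zip(axis, r)) for axis in axes] for r in rows]

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points,
                             key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def segment(rows, n_components, k):
    """Standardize, reduce with PCA, then cluster with K-means."""
    z = standardize(rows)
    reduced = project(z, principal_axes(z, n_components))
    return kmeans(reduced, k)

if __name__ == "__main__":
    # Hypothetical customers scored on four made-up indexes.
    customers = [[1, 2, 1, 2], [2, 1, 2, 1], [1, 1, 2, 2],
                 [10, 11, 10, 11], [11, 10, 11, 10], [10, 10, 11, 11]]
    print(segment(customers, n_components=2, k=2))
```

In practice the number of retained components and the number of clusters would be chosen from the data (for example from explained variance and a cluster validity measure); the values used here are placeholders.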