| With the development of computer network technology, the network has reached every corner of the world. However, Internet users often encounter slow responses and even error responses. The gap between the actual performance of the network and the expected quality of service is large and has become a focus of attention. In view of this, building a comprehensive middle-layer CDN (Content Delivery Network, also called Content Distribution Network) on top of the existing network is currently the most popular solution. Within the CDN architecture, the content routing system is the essential part, because it selects the best edge server according to the user's request and the condition of the edge servers. In the design of a new content routing system, the division of users into clusters is also very important. Data mining is the process of extracting implicit, previously unknown, and useful knowledge from databases, and it has been applied in many fields in recent years. Clustering analysis is one of the main techniques in data mining research, and a large body of theory and methods has been developed for it. With the establishment of decision support systems such as data warehouses and the demand for business intelligence in data-intensive enterprises, data mining has been used in many new applications, and research on data clustering faces many new challenges. A CDN is a typical data-intensive web service; segmenting its customers and assigning different edge servers to different user clusters has become an urgent requirement for improving CDN performance. 
Based on the requirements of user segmentation and the characteristics of the data in CDN applications, this paper proposes a new algorithm for clustering data sets with mixed numerical and categorical values. Without requiring third-party software on the client side, the paper studies optimization techniques for CDN routing, proposes a new CDN route optimization scheme, and studies the use of clustering algorithms for dividing CDN users. The research in this paper is summarized as follows: 1. CDN technology and the routing system are introduced briefly, with emphasis on the global load balancing methods: DNS-based redirection, HTTP redirection, and WAN triangular redirection. The advantages and disadvantages of these redirection methods are also discussed. 2. Client-side probe technology is studied. In light of the shortcomings of current CDN routing technology, the combination of global load balancing and client-side probing is argued for and demonstrated. 3. Data mining technology, especially Web data mining, is studied, and the development directions of Web data mining are summarized. 4. The application of clustering to CDN user segmentation is studied, and the basic theory, methods, and process of customer segmentation are discussed. CDN applications are used more and more widely and can alleviate the problem of insufficient network bandwidth. A CDN that meets users' needs will have better opportunities for development. The combination of data mining and CDN content routing, applied directly to the new content routing technology, therefore has both practical value and reference value. |
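
The abstract does not state which dissimilarity measure the proposed algorithm uses for mixed numerical and categorical data. As a rough illustration only, the sketch below uses a k-prototypes-style measure (squared Euclidean distance on numerical attributes plus a weighted count of categorical mismatches), which is a common choice and not necessarily the paper's method. The record layout, the attribute names, and the weight `gamma` are all assumptions made for this example.

```python
from typing import Sequence, Tuple

# Hypothetical user record (not from the paper):
# (avg_latency_ms, avg_request_kb, isp, region)
User = Tuple[float, float, str, str]

NUMERIC = (0, 1)      # indices of numerical attributes (assumed)
CATEGORICAL = (2, 3)  # indices of categorical attributes (assumed)


def mixed_distance(a: User, b: User, gamma: float = 1.0) -> float:
    """Squared Euclidean distance on the numerical fields plus gamma
    times the number of mismatching categorical fields."""
    numeric = sum((a[i] - b[i]) ** 2 for i in NUMERIC)
    mismatches = sum(1 for i in CATEGORICAL if a[i] != b[i])
    return numeric + gamma * mismatches


def nearest_prototype(user: User, prototypes: Sequence[User],
                      gamma: float = 1.0) -> int:
    """Return the index of the cluster prototype closest to the user."""
    return min(range(len(prototypes)),
               key=lambda k: mixed_distance(user, prototypes[k], gamma))


# Two illustrative cluster prototypes and one user to assign.
prototypes = [
    (20.0, 100.0, "isp_a", "north"),
    (80.0, 500.0, "isp_b", "south"),
]
user = (25.0, 120.0, "isp_a", "south")
cluster = nearest_prototype(user, prototypes)
```

In practice the numerical attributes would usually be normalized first so that no single attribute dominates the distance, and `gamma` would be tuned to balance the numerical and categorical parts.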