| P2P is a network service that has developed rapidly in recent years and opens up resources at the edge of the network. P2P users can download resources more quickly and can also find resources that are ordinarily hard to obtain. Its convenience and speed keep attracting more users. Because P2P consumes a growing share of network resources, it causes network congestion and hampers normal business and the operation of other critical services. Telecommunications operators and other companies therefore plan to identify P2P traffic in their networks so that they can then restrict it. Traffic analysis methods based on static ports, protocols, and payloads have achieved good classification results. However, as P2P protocols have evolved, many P2P services use dynamic ports or even occupy fixed ports, and they encrypt their payloads, so the features that traditional classification relies on are no longer visible or no longer valid. This research classifies P2P traffic at the flow level, considering the port together with flow properties that are independent of port and payload. Guided by machine learning, the SVM algorithm is used to train a classification model, which then predicts the class of each flow. This paper mainly completes the following work: 1. Review P2P. BitTorrent (BT) is the most widely used P2P application, so the review focuses on the working process and principles of BT; 2. Study machine learning. Describe its significance and content, and examine how it is applied to flow identification; 3. Study SVM. Based on its principles, propose a general approach to identifying P2P flows with SVM; 4. Test libSVM. Study its input format to determine the output data format of the flow properties; 5.
Write code to extract flow attribute data. Because a large number of packets must be processed to obtain flow attributes, a program is written to extract them quickly and in bulk; 6. Implement flow prediction with libSVM. Use the test data to assess the feasibility and accuracy of identifying P2P flows with SVM-based machine learning. By identifying P2P flows, network operators and other network managers can address the misuse of bandwidth, network performance degradation, copyright protection, and other issues caused by the rapid growth of P2P. P2P traffic identification is also a hot topic for network researchers, and P2P traffic and P2P behavior need to be understood and studied in depth. In addition, machine learning, the study of how to use machines to simulate human learning activities, has been widely used in data mining, which inspired this work. This paper makes an initial attempt to study the feasibility and accuracy of identifying P2P flows with machine learning. |
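Steps 4 and 5 above can be illustrated with a minimal sketch, which is not the program written for this paper. It groups packets into flows and writes each flow in the sparse text format that libSVM reads (`<label> <index>:<value> ...`, with indices starting at 1). The packet tuple layout, the four flow attributes, the unidirectional 5-tuple flow key, and the function names `flow_features` and `to_libsvm_line` are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def flow_features(packets):
    """Group packets by 5-tuple and compute simple per-flow statistics.

    Each packet is a hypothetical tuple:
    (timestamp, src_ip, src_port, dst_ip, dst_port, protocol, size_bytes)
    Flows are treated as unidirectional here (an assumption).
    """
    flows = defaultdict(list)
    for ts, src, sport, dst, dport, proto, size in packets:
        flows[(src, sport, dst, dport, proto)].append((ts, size))

    features = {}
    for key, pkts in flows.items():
        times = [t for t, _ in pkts]
        sizes = [s for _, s in pkts]
        features[key] = [
            float(len(pkts)),          # attribute 1: packet count
            float(sum(sizes)),         # attribute 2: total bytes
            sum(sizes) / len(sizes),   # attribute 3: mean packet size
            max(times) - min(times),   # attribute 4: flow duration (seconds)
        ]
    return features


def to_libsvm_line(label, values):
    """Format one sample as '<label> 1:<v1> 2:<v2> ...' for libSVM."""
    pairs = " ".join(f"{i}:{v:g}" for i, v in enumerate(values, start=1))
    return f"{label} {pairs}"


if __name__ == "__main__":
    # Illustrative packets only; the label +1/-1 (P2P / non-P2P) is assumed.
    packets = [
        (0.0, "10.0.0.1", 51413, "10.0.0.2", 6881, "TCP", 100),
        (0.5, "10.0.0.1", 51413, "10.0.0.2", 6881, "TCP", 300),
        (0.2, "10.0.0.3", 40000, "10.0.0.4", 80, "TCP", 60),
    ]
    for key, values in flow_features(packets).items():
        label = 1 if key[3] == "10.0.0.2" else -1
        print(to_libsvm_line(label, values))
```

Lines in this format can be saved to a file and passed to the libSVM training and prediction tools. In practice the attributes would usually be scaled first, since SVMs are sensitive to feature ranges.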