| The explosion of P2P file sharing has had a significant impact on the Internet. Its bandwidth-hungry behavior consumes a large share of available bandwidth, lowers network performance, degrades quality of service, and can even cause network congestion. To control and manage P2P applications, P2P traffic must be identified accurately. This paper begins with an analysis of peer-to-peer network characteristics, then examines existing P2P traffic identification methods, including the PORT method, the TCP-UDP method, the IP-PORT method, and the DPI method, and analyzes their advantages and disadvantages. Because these traditional methods cannot identify encrypted data, we present a novel identification method: the double-feature identification method, based on both the features of network flows and the characteristics of P2P networks. This is the first time that network traffic characteristics and P2P network features have been used together to identify P2P traffic. The main work and innovations are as follows: (1) First, the method clusters network flows with the K-means algorithm using network traffic characteristics (such as the TCP/UDP ratio, the average IP packet length, and the average arrival time). Then, using IP addresses to represent hosts and the connections between them, it builds an IP-Graph. Finally, it identifies P2P traffic by the characteristics of P2P networks (such as large scale and multiple connections during interaction). Because the method does not depend on the content of the data stream, it can achieve better results on encrypted traffic. (2) To test the method, we designed and developed an experimental platform on which the identification model was evaluated. The experimental results show that, although there are some false positives, the method achieves a high correct identification rate and a low false negative rate, and it has practical value. |
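The three steps in item (1) can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the paper. Everything the abstract does not specify is an assumption made for the example: the function names, the sample flows, the feature vectors (assumed already scaled to [0, 1]), the deterministic centroid seeding, and the thresholds `min_hosts` and `min_peers`, which stand in for "large scale" and "multiple connections".

```python
from collections import defaultdict
from math import dist


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means; centroids are seeded with the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        new = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new.append(centroids[c])  # keep an empty cluster's centroid
        if new == centroids:
            break
        centroids = new
    return labels, centroids


def build_ip_graph(flows):
    """Undirected graph: nodes are IP addresses, edges are observed connections."""
    graph = defaultdict(set)
    for src, dst in flows:
        graph[src].add(dst)
        graph[dst].add(src)
    return graph


def identify_p2p(flows, features, k=2, min_hosts=4, min_peers=3):
    """Cluster flows by traffic features, then apply graph checks per cluster.

    min_hosts and min_peers are hypothetical thresholds, not values from the paper.
    """
    labels, _ = kmeans(features, k)
    p2p = set()
    for c in range(k):
        cluster = [f for f, l in zip(flows, labels) if l == c]
        graph = build_ip_graph(cluster)
        if len(graph) < min_hosts:  # "large scale" check
            continue
        # "Multiple connections" check: hosts with many distinct peers.
        hubs = {ip for ip, peers in graph.items() if len(peers) >= min_peers}
        p2p.update(f for f in cluster if f[0] in hubs or f[1] in hubs)
    return p2p


# Hypothetical flows as (source IP, destination IP) pairs.
flows = [
    ("10.0.0.1", "10.0.0.2"), ("10.0.0.9", "192.0.2.1"),
    ("10.0.0.1", "10.0.0.3"), ("10.0.0.1", "10.0.0.4"),
    ("10.0.0.2", "10.0.0.3"), ("10.0.0.2", "10.0.0.4"),
    ("10.0.0.3", "10.0.0.4"), ("10.0.0.8", "192.0.2.1"),
]
# Hypothetical per-flow features, scaled to [0, 1]:
# (TCP/UDP ratio, average IP packet length, average arrival time).
features = [
    (0.50, 0.30, 0.20), (0.95, 0.90, 0.80),
    (0.55, 0.35, 0.25), (0.45, 0.25, 0.20),
    (0.50, 0.30, 0.15), (0.60, 0.35, 0.20),
    (0.50, 0.25, 0.25), (0.90, 0.85, 0.75),
]

result = identify_p2p(flows, features)
for flow in sorted(result):
    print(flow)
```

In this toy data the six flows among the four meshed hosts are flagged, while the two client-to-server flows are not. No payload is inspected at any point, which is why an approach of this kind is unaffected by encryption.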