| The two most important differences between a P2P network and a traditional client/server (C/S) network are decentralization and self-organization. These characteristics attract more users to P2P networks and increase the number of source nodes. As a result, P2P plays an increasingly important role in resource sharing and is developing rapidly. This growth, however, brings many problems. According to statistics, 60% to 80% of bandwidth is consumed by P2P applications. Excessive P2P traffic overburdens the network, causes congestion, and violates QoS guarantees. In addition, because a P2P network has no central node, it is difficult to manage, so illegal resources and even computer viruses spread across it. It is therefore imperative to develop an effective traffic management method, and accurate identification of P2P traffic is the key to doing so. In this thesis, the most popular methods for identifying P2P flows are studied and compared, and a new system that identifies P2P flows quickly and effectively is proposed. The main work is as follows. First, the thesis analyzes three important identification methods: the traditional approach, deep packet inspection, and classification based on flow statistical patterns. It then proposes an integrated P2P inspection system that combines deep packet inspection with flow statistical patterns. Second, well-known P2P applications are studied in terms of their structure, communication, and file-exchange process. A flow packet-feature database is built from the analysis of many packets, and the packet-inspection module is implemented on top of it. After a study of machine learning theory, C4.5 is chosen from among the decision tree and Bayesian classification algorithms, and attribute reduction is based on CFS. Finally, the system is implemented in a Windows environment. Experiments show that the system can identify P2P traffic and performs better than the other methods. |
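The integrated design described above, payload inspection first and flow statistics as a fallback, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Flow` fields, the function names, and the numeric thresholds are hypothetical. The two signatures are the publicly documented BitTorrent and Gnutella handshake prefixes, used only for illustration and not taken from the thesis's feature database. The threshold rule is a stand-in for a trained C4.5 tree.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Illustrative payload signatures (well-known handshake prefixes).
# Assumption: the thesis's own packet-feature database is not reproduced here.
SIGNATURES = {
    "BitTorrent": b"\x13BitTorrent protocol",
    "Gnutella": b"GNUTELLA CONNECT/",
}


@dataclass
class Flow:
    """Hypothetical per-flow record; field names are assumptions."""
    first_payload: bytes        # payload of the first data packet
    mean_packet_size: float     # bytes
    packets_per_second: float


def match_signature(payload: bytes) -> Optional[str]:
    """Deep packet inspection stage: return the application name on a prefix match."""
    for app, prefix in SIGNATURES.items():
        if payload.startswith(prefix):
            return app
    return None


def placeholder_statistical_model(flow: Flow) -> bool:
    """Stand-in for a trained C4.5 decision tree.

    The thresholds are invented for illustration and carry no empirical meaning.
    """
    return flow.mean_packet_size > 1000 and flow.packets_per_second > 50


def identify(flow: Flow,
             model: Callable[[Flow], bool] = placeholder_statistical_model) -> str:
    """Try payload signatures first; fall back to the flow-statistics classifier."""
    app = match_signature(flow.first_payload)
    if app is not None:
        return f"P2P ({app}, payload signature)"
    if model(flow):
        return "P2P (flow statistics)"
    return "non-P2P"


if __name__ == "__main__":
    handshake = Flow(b"\x13BitTorrent protocol" + bytes(8), 68.0, 2.0)
    encrypted = Flow(b"\x8f\x02\xaa", 1400.0, 120.0)
    web = Flow(b"GET / HTTP/1.1\r\n", 300.0, 5.0)
    for f in (handshake, encrypted, web):
        print(identify(f))
```

Ordering the stages this way means that flows with a recognizable payload are labeled by a simple prefix comparison, while flows whose payload matches no signature (for example, encrypted traffic) are still passed to the statistical classifier.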