| With the rapid development of the Internet, more and more Internet applications are appearing. Identifying and classifying the traffic that traverses a network link is an important task for administrators at Internet Service Providers (ISPs), who need to understand which applications their users run and to assess how well their network is prepared to deliver quality service to their customers. Traditional traffic classification methods, such as port-based classification, are no longer effective. The methods now in wide use are machine learning algorithms, application communication behavior analysis, and payload-based methods. Among these, Deep Packet Inspection (DPI), a payload-based method, is the most widely used and most accurate technique. To use it, however, a DPI system must first have a good set of application signatures. Building such a set has traditionally been very time-consuming and has demanded considerable expertise, because each signature must be accurate and specific to a single application. This thesis studies the payloads of many application-layer protocols, especially P2P protocols. It then proposes an automatic signature extraction algorithm and studies its performance and accuracy. 
The validity and accuracy of the algorithm are then verified. The main work and contributions of this thesis are as follows: (1) Related work is reviewed, with a comprehensive comparison of the advantages and disadvantages of port-based methods, machine learning methods, application communication behavior analysis methods, and payload-based methods. (2) A pure-flow collection tool is designed and implemented; it captures packets from the NIC and classifies each flow by process name. (3) The application signature is defined and the rules governing its occurrence are analyzed, which informs the design of the algorithm. (4) Several commonly used signature generation algorithms are compared in terms of efficiency and accuracy. (5) A new automatic application-layer protocol signature extraction algorithm is proposed, which purifies the raw signatures by frequency control. (6) To verify the accuracy of the proposed algorithm, a demo DPI system is designed that tests the accuracy of a given set of signatures. |
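
Contribution (5) describes purifying raw signatures by frequency control, but the abstract does not spell out the procedure. The Python sketch below is therefore only one plausible reading, not the thesis's algorithm: it treats every fixed-length byte substring of a payload as a raw signature and keeps those that occur in at least a given fraction of the flows collected for one application. The function names (`candidate_substrings`, `extract_signatures`, `matches_any`), the parameters `n` and `min_ratio`, and the toy payloads are all assumptions introduced for illustration.

```python
from collections import Counter


def candidate_substrings(payload: bytes, n: int) -> set:
    """Return the distinct n-byte substrings of one payload (raw signatures)."""
    return {payload[i:i + n] for i in range(len(payload) - n + 1)}


def extract_signatures(payloads: list, n: int = 4, min_ratio: float = 0.8) -> list:
    """Keep the n-byte substrings found in at least min_ratio of the payloads.

    Assumed interpretation of "frequency control": a substring that shows up
    in only a few flows is treated as noise and dropped.
    """
    if not payloads:
        return []
    counts = Counter()
    for payload in payloads:
        # Using a set per payload counts each substring once per flow.
        counts.update(candidate_substrings(payload, n))
    threshold = min_ratio * len(payloads)
    return sorted(s for s, c in counts.items() if c >= threshold)


def matches_any(payload: bytes, signatures: list) -> bool:
    """Naive DPI-style check: does the payload contain any signature?"""
    return any(sig in payload for sig in signatures)


if __name__ == "__main__":
    # Toy payloads that share a common prefix and differ in their tails.
    flows = [
        b"\x13BitTorrent protocol abc",
        b"\x13BitTorrent protocol xyz",
        b"\x13BitTorrent protocol 123",
    ]
    sigs = extract_signatures(flows, n=4, min_ratio=1.0)
    print(len(sigs), "signatures kept")
    print(matches_any(b"GET / HTTP/1.1", sigs))
```

A fixed substring length keeps the sketch short; a practical extractor would more likely search for variable-length common substrings and would also check candidates against traffic from other applications, so that a signature is specific as well as frequent.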