| Network traffic analysis and classification is a basic but important technique for network monitoring. It is widely used in network activities such as QoS guarantees, network security, and billing. With the development of high-speed networking, and especially the appearance of gigabit- and terabit-rate network technology, existing traffic classification methods are becoming less capable of handling the volume of data carried by the network because of their high complexity and low efficiency. New methods should therefore address not only classification accuracy but also other performance issues, such as efficiency and throughput. To reach these goals, we investigate the application of feature selection techniques to high-speed network traffic flow classification. Feature selection is an important methodology for data mining problems: removing irrelevant and redundant attributes from the original data set can greatly simplify the building of classifier models. In this thesis, we carry out an in-depth analysis of the statistical flow features of P2PStream, a typical and popular network application, under ADSL and CDMA networks respectively, to demonstrate that each application has certain statistical flow features that distinguish it from others. We then apply feature selection techniques to network traffic flow classification and conduct experiments using actual ADSL and CDMA network data collected from the Internet in China. The results show that building the classifier with an appropriate feature selection method can simplify the network traffic classifier while achieving satisfactory classification accuracy. |