| The rapid development of network technologies and applications has brought new concepts such as the Cognitive Network and the Software-Defined Network (SDN). However, it also brings greater management challenges and security risks. Recent research has introduced machine learning and deep learning methods into cyber security, especially into attack detection systems. Nevertheless, the flexibility enabled by SDN has not been fully utilized. By introducing the perception and decision-making closed loop of the cognitive network into SDN management, we use both network programmability and existing security technology to propose a machine-learning-centered, human-assisted attack defense system. The core of the new system lies in using the SDN Service Function Chain to flexibly handle suspicious traffic that the classification model cannot confidently classify, while using a clustering model to analyze and process the corresponding suspicious logs. While maintaining a low false alarm rate and a low cost to user experience, the system can thus directly intercept some new attacks, actively raise alarms on changes in the security environment, and support its own upgrade by extracting crucial data samples. We present two new Web security datasets obtained by studying and tagging two sets of real HTTP access logs. First, we tagged approximately 15,000 publicly available access logs from Apache that contain multiple types of attacks but have no categorized attack labels. Second, we removed the attack logs from a large number of recent access logs from an anonymous website, and provided a method to manually generate attack logs by combining its safe logs with three Web-attack datasets, organized by attack type, that were extracted from vulnerability scanning tools. In terms of algorithm design, we applied a character-level convolutional neural network to the feature extraction of Web logs to replace the traditional security ontology. Experiments show that this new method can
project the original logs into a high-dimensional hidden space and provide effective features for downstream machine learning tasks such as attack detection. Moreover, it no longer relies on prior security knowledge and hides the differences between log formats. For the downstream attack detection classifier, we demonstrated the characteristics and advantages of the high-performance extreme learning machine. Experiments show that even at a high accuracy of 95%, adding a suspicious class further improves its usability. Suspicious samples obtained by applying a simulated zero-day attack to the classifier allowed us to show, through further experiments, that DBSCAN is a feasible clustering model for suspicious samples. Evaluations under two scenarios of security environment change, "zero-day attack" and "fake zero-day attack", verified the effectiveness of our representation, classification, and clustering algorithms as a single collaborating system. When a new type of Web behavior appears, the number of suspicious samples found by the classification model increases rapidly. While the alarm is thus triggered, the clustering model accurately gathers the new behaviors among the suspicious samples into its largest cluster. Since new-behavior samples make up more than 90% of the largest cluster at the initial alarm, manually tagging this whole cluster allows the classification model to be updated quickly and to respond to the security environment change immediately. |
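
The suspicious class and the alarm on a rising number of suspicious samples, as described in the abstract, can be illustrated with a minimal sketch. It assumes a classifier that outputs an attack probability for each request; the thresholds, the window size, and the names `triage` and `SuspiciousRateAlarm` are hypothetical and are not taken from the paper.

```python
from collections import deque

# Hypothetical thresholds; the abstract does not state the values actually used.
BENIGN_MAX = 0.2  # scores at or below this are treated as benign
ATTACK_MIN = 0.8  # scores at or above this are treated as attacks


def triage(attack_score: float) -> str:
    """Map an attack probability to one of three classes."""
    if attack_score >= ATTACK_MIN:
        return "attack"
    if attack_score <= BENIGN_MAX:
        return "benign"
    # The classifier is not confident either way.
    return "suspicious"


class SuspiciousRateAlarm:
    """Signal when the share of suspicious requests in a sliding window is too high."""

    def __init__(self, window: int = 100, max_ratio: float = 0.3):
        self.flags = deque(maxlen=window)  # True for each suspicious request
        self.max_ratio = max_ratio

    def observe(self, label: str) -> bool:
        """Record one label and return True if the alarm should fire."""
        self.flags.append(label == "suspicious")
        # Wait until the window is full to avoid alarms on a cold start.
        if len(self.flags) < self.flags.maxlen:
            return False
        return sum(self.flags) / len(self.flags) > self.max_ratio


# Usage: ten clearly benign requests followed by a burst of uncertain ones.
alarm = SuspiciousRateAlarm(window=10, max_ratio=0.3)
for score in [0.05] * 10 + [0.5] * 5:
    if alarm.observe(triage(score)):
        print("alarm: the share of suspicious requests is rising")
        break
```

In the system the abstract describes, samples labeled suspicious would then be passed to the clustering model, and the largest cluster would be offered for manual tagging; that step is not shown here.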