| With the rapid development of Internet technology, Web applications have become an integral part of daily life, covering fields such as e-commerce, social media, online banking, and healthcare. As Web applications have spread, however, network security threats have grown more serious, and Web attack detection systems are needed to defend against attacks and to protect the property and private information of enterprises and users from leakage. Web attack detection is the process of monitoring and identifying malicious attacks; in-depth analysis and classification of malicious payloads allows such attacks to be identified more accurately and quickly. Building on the research group's vulnerability security platform, this paper takes payloads as the research object, classifies and identifies malicious attack payloads by introducing algorithms and building models, and applies them in a prototype Web attack detection system, which gives the work both research significance and application value. The main research content of this paper is as follows: (1) To address the high false alarm rate and oversized dictionaries produced by preprocessing in current Web attack detection, this paper proposes an improved payload preprocessing method based on the TF-IDF algorithm. The method first uses regular expressions to extract payloads from HTTP data and segments them into tokens, removing redundant symbols and protocol data. An improved version of the TF-IDF algorithm from the field of text classification is then applied to weight the word frequencies of the payloads, highlighting attack payloads that are few in number but have a high impact on classification. The constructed lexicon serves as the input to the subsequent classification model. (2) Building on the proposed preprocessing method, a deep learning-based feature fusion model is proposed to address the problem that a single model cannot adequately extract payload features. The model uses Text-CNN and 
BiLSTM-Attention to extract local word features of payloads and contextual features of long texts, respectively, then concatenates the two feature sets and feeds them into a fully connected layer for classification. Experiments on publicly available datasets show that the training speed of the model improves significantly when it is combined with the preprocessing method proposed in this paper. The proposed model also achieves higher classification accuracy and a lower false alarm rate than single deep learning models and other classification models in the literature. (3) A Web attack detection system is designed and implemented. The system uses an architecture with separated front end and back end, implemented with React and Flask. MySQL serves as the back-end database, and the functional modules include login, attack overview, attack management, and user center. A visual interface is also provided to display the data more intuitively. This paper combines the above preprocessing method and feature fusion model for payload classification in order to improve Web attack detection. Experiments on public datasets show that the model reaches a classification accuracy of 99.21% with a false alarm rate of only 0.43%, so it identifies malicious attacks well, and the Web attack detection system built on the model can help users resist the harm caused by Web attacks. |
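
The preprocessing pipeline in item (1) can be illustrated with a minimal sketch. It is not the method proposed in the paper: the abstract does not specify the improvements made to TF-IDF, so the code below uses plain TF-IDF with a smoothed IDF term. The function names (`extract_payload`, `tokenize`, `tfidf`), the regular expressions, and the sample requests are all hypothetical and chosen only for illustration; the sketch also assumes the payload is the query string of an HTTP request line.

```python
import math
import re
from collections import Counter
from urllib.parse import unquote_plus

# Hypothetical token pattern: identifiers, numbers, or single symbols such as ' = < >
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\w\s]")

# Hypothetical request-line pattern: METHOD, path, '?', query string, HTTP version
REQUEST_RE = re.compile(r"^[A-Z]+\s+\S*?\?(\S*)\s+HTTP/\d(?:\.\d)?$")


def extract_payload(request_line):
    """Return the URL-decoded query string, dropping method, path, and version."""
    match = REQUEST_RE.match(request_line)
    return unquote_plus(match.group(1)) if match else ""


def tokenize(payload):
    """Split a payload into lowercase word, number, and symbol tokens."""
    return TOKEN_RE.findall(payload.lower())


def tfidf(docs):
    """Return one {token: weight} dict per tokenized payload (plain TF-IDF)."""
    n = len(docs)
    # Document frequency: number of payloads in which each token appears
    df = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Term frequency times smoothed inverse document frequency
            tok: (cnt / len(doc)) * (math.log((1 + n) / (1 + df[tok])) + 1)
            for tok, cnt in tf.items()
        })
    return weights


if __name__ == "__main__":
    # Made-up sample requests: two benign searches and one SQL-injection-style query
    requests = [
        "GET /search?q=shoes HTTP/1.1",
        "GET /search?q=red+shoes HTTP/1.1",
        "GET /item?id=1%27+OR+%271%27%3D%271 HTTP/1.1",
    ]
    docs = [tokenize(extract_payload(r)) for r in requests]
    for tokens, weights in zip(docs, tfidf(docs)):
        print(tokens, weights)
```

Tokens that occur in few payloads receive a larger IDF factor than tokens shared by most payloads, which is the property the abstract relies on to emphasize rare but discriminative attack tokens. The resulting weights could then be used to build the lexicon that feeds a downstream classifier.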