Conv-BiLSTM: A New Intelligent WebShell Detection Network Based On Bi-LSTM

Posted on: 2022-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: J J Min
Full Text: PDF
GTID: 2518306491485644
Subject: Engineering and Computer Technology
Abstract/Summary:
With its abundant information resources, the Internet has gradually become the most direct and convenient way for people to obtain information. Correspondingly, server-side security has become an essential part of network security. As a tool used by attackers to compromise Web servers, the WebShell is an important research topic in website attack and defense. Existing WebShell detection methods fall into three main categories: the first is based on files, the second on Web system logs, and the third on HTTP traffic. These methods share two common problems: 1. feature extraction is not comprehensive enough; 2. detection performance needs to be improved. To address these problems, the work of this paper is as follows:

(1) In terms of feature extraction, this paper studies several text conversion methods, including formatted source code, Abstract Syntax Tree (AST), and Operation Code (OPCODE). It also evaluates several word vector extraction methods, including Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, and Bidirectional Encoder Representations from Transformers (BERT). Multiple sets of experiments are carried out with these text preprocessing methods, and they lead to the following conclusion: the best detection performance is obtained by combining OPCODE as the text conversion method with BERT as the word vector extraction method.

(2) To improve the detection of WebShell files, this paper adopts the text preprocessing method derived in part (1) and proposes Conv-BiLSTM, an improved network based on Bi-directional Long Short-Term Memory (Bi-LSTM). First, the network applies convolutional and pooling layers, whose number and size are configurable, to perform feature extraction and feature dimension reduction on the word vectors. Then it uses a Bi-LSTM architecture to perform further feature extraction on the text information, and finally it completes the classification task. Compared with the experiments in part (1) and with WebShell detection products currently used in industry, Conv-BiLSTM achieves better detection results in this paper’s experimental environment.
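As a rough illustration of the simplest word vector method named in part (1), the sketch below computes TF-IDF weights over opcode token sequences. It is a minimal example under stated assumptions, not the thesis's implementation: the function name `tf_idf` is hypothetical, the opcode names are illustrative placeholders rather than output from the thesis's dataset, and the plain unsmoothed TF-IDF formula is assumed because the abstract does not say which variant was used.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {token: weight} dict per document (hypothetical helper).

    tf  = occurrences of the token in the document / document length
    idf = log(number of documents / number of documents containing the token)
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each token appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            tok: (cnt / total) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return vectors


# Illustrative opcode sequences (assumed, not taken from the thesis).
samples = [
    ["ECHO", "RETURN"],
    ["ASSIGN", "ECHO", "RETURN"],
    ["FETCH_R", "INCLUDE_OR_EVAL", "RETURN"],
]
vectors = tf_idf(samples)
```

In this toy input, a token that occurs in every sample (here `RETURN`) receives a weight of zero, while a token that occurs in only one sample (here `INCLUDE_OR_EVAL`) receives the largest inverse document frequency. This is why TF-IDF tends to emphasize rare operations over common ones.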
Keywords/Search Tags:WebShell, Machine Learning, Bi-LSTM, PHP, BERT