Research On Malicious Code Detection Method Based On Time Series Feature

Posted on:2022-01-11

Degree:Master

Type:Thesis

Country:China

Candidate:Q Q Gao

GTID:2518306326984699

Subject:Master of Engineering

Abstract/Summary:

Malicious code breaks into computer systems through web links, system vulnerabilities, email, and other channels, causing great losses to users, especially on Windows, today's most popular desktop operating system. The study of dynamic malicious code detection is therefore of great significance for building a safe and clean network environment. In recent years, researchers have used data mining methods to detect malicious code and have achieved high recognition rates. However, traditional machine learning methods require security personnel to design features manually in order to construct detection models, which demands considerable expert experience. Although deep learning methods can extract features automatically, their black-box nature makes it difficult to explain the basis of the model's decisions. The purpose of data mining research is to help people find the key information in data; this property is referred to here as interpretability. Methods from the field of time series classification offer a useful reference for both automatic feature extraction and model interpretation. Focusing on these two aspects, this paper represents the dynamic API call sequence of malicious code as a time series and studies malicious code detection based on time series features. The main work of this paper includes:

(1) A malicious code detection method based on time series features is studied. Experimental analysis shows that the amount of information contained in a malicious API call sequence segment differs considerably from that contained in a normal call sequence segment. In this paper, the local information entropy of the dynamic API sequence is calculated, converting the API call sequence into an entropy time series. Working at the level of information entropy, the Shapelet transform algorithm from time series classification is used to extract time series features automatically and to train a classifier that performs malicious code detection. The experimental results show that the proposed method is more accurate than the traditional methods and that its results can be interpreted.

(2) To address the low efficiency of the Shapelet algorithm for time series classification, an acceleration algorithm based on Shapelet is studied. Based on the idea of random projection, an improved Shapelet transformation algorithm, the Hash Shapelet transformation algorithm, is proposed to improve the time efficiency of the existing algorithm. Unlike the entropy-level feature extraction described above, the improved Shapelet transformation algorithm extracts time series features automatically from the original API call sequence to perform malicious code detection, which improves both the accuracy and the time efficiency of detection.

(3) A malicious code detection system based on time series features is designed and implemented, and the overall design and functional modules of the system are introduced in detail.
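
The two core steps named in the abstract, converting an API call sequence into an entropy time series and scoring a series against a Shapelet, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not define "local information entropy" precisely, so the sketch assumes Shannon entropy over a fixed-size sliding window, and the function names, the window size, and the example API names are all hypothetical. The distance function shows only the feature computation used by a standard Shapelet transform (minimum Euclidean distance to any subsequence), not Shapelet discovery or the Hash Shapelet acceleration.

```python
import math
from collections import Counter


def local_entropy_series(api_calls, window=4):
    """Convert a sequence of API names into an entropy time series.

    Assumption: "local" entropy is the Shannon entropy (in bits) of the
    API-name distribution inside each sliding window of `window` calls.
    """
    if window <= 0 or window > len(api_calls):
        raise ValueError("window must be between 1 and len(api_calls)")
    series = []
    for start in range(len(api_calls) - window + 1):
        counts = Counter(api_calls[start:start + window])
        # H = sum over names of p * log2(1 / p), with p = count / window
        entropy = sum((c / window) * math.log2(window / c)
                      for c in counts.values())
        series.append(entropy)
    return series


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any
    equal-length subsequence of the series.

    In a Shapelet transform, this distance to each selected shapelet
    becomes one feature of the transformed sample.
    """
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than series")
    return min(
        math.sqrt(sum((series[i + j] - shapelet[j]) ** 2 for j in range(m)))
        for i in range(len(series) - m + 1)
    )


if __name__ == "__main__":
    # Hypothetical trace: a repetitive run followed by varied calls.
    calls = ["ReadFile"] * 4 + ["OpenKey", "WriteFile", "Connect", "Send"]
    entropy_ts = local_entropy_series(calls, window=4)
    print(entropy_ts)
    # Distance of the series to a hypothetical "high entropy" shapelet.
    print(shapelet_distance(entropy_ts, [2.0, 2.0]))
```

A classifier would then be trained on the vector of distances from each sample to a set of selected Shapelets. Because each feature corresponds to a concrete subsequence, a prediction can be traced back to the region of the API trace that matched, which is the kind of interpretability the abstract refers to.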

Keywords/Search Tags:

Malicious code detection, time series classification, Shapelet, information entropy, interpretability

Related items

1	Research On Time Series Classification Algorithm Based On Shapelet Learning And Transformation
2	Research On Clustering-based Fast Shapelet Discovery Algorithm And Its Application
3	Research On Network Traffic Classification Methods Based On Time Series Features
4	Time Series Classification Methods Based On Deep Shapelet Learning
5	Research On Data Mining Method Based On U-shapelet Time Series
6	Design And Implementation Of JavaScript Malicious Code Detection Tool Based On Information Entropy
7	Research And Application Of Time Series Mining Method Based On Shapelet
8	Research On Key Technologies Of Malicious Code And Emergency Response In Communication Networks
9	Research And Design Of Malicious Code Detection Technology Based On Deep Neural Network
10	Research On Android Malicious Application Identification And Malicious Family Classification Technology Based On API Call Analysis