Research On Cyber Security Threat Discovery And Tracking Technology Based On Topic Detection

Posted on:2020-04-21

Degree:Master

Type:Thesis

Country:China

Candidate:X Zhou

Full Text:PDF

GTID:2428330572972245

Subject:Computer Science and Technology

Abstract/Summary:

The emergence of malicious software and advanced persistent threats (APTs) requires security experts to analyze and detect network threats in real time from open source data, transforming them into readable threat intelligence that helps security analysts respond quickly and defend against emerging cyber threats. However, it is not feasible to manually identify cyber threats from large amounts of unstructured open source text. For these reasons, we need multi-dimensional knowledge discovery and data mining methods to help our system better understand network threats. First, we extract threat-related information from multi-source security data, and then synthesize these knowledge fragments into a higher-level concept that describes a potential threat. The task can be described as identifying upcoming security topics in real time from fragments of open source threat information, forming threat intelligence, and helping security personnel respond quickly to emerging cyber threats.

Most previous work (SecurityWeek, the ThreatBook system) focuses on using machine learning to discover general threat categories rather than real-time threat targets. Previous systems require keywords as input, or only report general threat categories (Virus & Threats) instead of specific threats (APT). Therefore, we propose a novel method, FAC-CTI (network threat intelligence detection based on domain feature extraction and improved hierarchical clustering), to analyze open source threat data and identify emerging threat topics in real time.

The FAC-CTI method of threat topic detection in this paper is composed of three parts: data collection and preprocessing, key feature extraction, and topic clustering. In the first part, the data acquisition module collects security data from security BBS and security information websites. In the second part, we propose three feature extraction methods: ① building on TF-IDF (Term Frequency-Inverse Document Frequency) keyword recognition, this paper proposes an incremental TF-IDF method that considers word position and part of speech when calculating word weights and extracting keyword features; ② combining a word vector model trained with transfer learning, this paper proposes a Latent Dirichlet Allocation (LDA) method with word similarity and a domain filtering strategy to identify topic features; ③ an entity identification method identifies domain-specific entity features such as place names, person names, and security organizations. In addition, a feature fusion technique integrates the above features to build the feature vector of each article. Unlike previous open source threat intelligence work, this feature extraction approach makes full use of security domain knowledge, extracting features with domain knowledge and constructing the article vector from them. In the third part, based on the HAC (hierarchical clustering) algorithm, this paper proposes an improved hierarchical clustering algorithm that clusters the articles in each time period, mines security topics, and recognizes in real time whether a topic is emerging or continues a historical topic.

The experimental data set comes from an open source wiki, as well as eight security websites and BBS collected by crawlers. The experimental results show that the FAC-CTI method performs well and can identify threat topics accurately. The recall rate, accuracy rate, and F value of threat topic detection on the two datasets are all above 0.98, higher than those of other common topic detection methods.
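
The keyword weighting and clustering steps described in the abstract can be illustrated with a minimal sketch. This is not the thesis's FAC-CTI implementation: the position weights (`TITLE_WEIGHT`, `BODY_WEIGHT`), the smoothed IDF formula, the average-linkage rule, the similarity `threshold`, and the toy documents are all assumptions chosen for illustration, and part-of-speech weighting, LDA topic features, entity features, and incremental updating are omitted.

```python
import math
from collections import Counter

# Hypothetical position weights: a term in the title counts more than one in the body.
TITLE_WEIGHT = 3.0
BODY_WEIGHT = 1.0


def weighted_tfidf(docs):
    """docs: list of (title_tokens, body_tokens). Returns one sparse vector (dict) per document."""
    n = len(docs)
    df = Counter()
    for title, body in docs:
        df.update(set(title) | set(body))
    vectors = []
    for title, body in docs:
        tf = Counter()
        for tok in title:
            tf[tok] += TITLE_WEIGHT
        for tok in body:
            tf[tok] += BODY_WEIGHT
        # Smoothed IDF (an assumption): terms found in every document keep a small weight.
        vectors.append({t: w * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, w in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster(vectors, threshold=0.3):
    """Average-linkage agglomerative clustering that stops once no pair is similar enough."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = linkage(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < threshold:
            break  # remaining clusters are treated as separate topics
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy articles as (title tokens, body tokens); invented for this example.
docs = [
    (["apt", "phishing"], ["apt", "group", "phishing", "email", "campaign"]),
    (["apt", "campaign"], ["phishing", "email", "apt", "target"]),
    (["ransomware", "outbreak"], ["ransomware", "encrypt", "files", "hospital"]),
]
vectors = weighted_tfidf(docs)
topics = cluster(vectors)
print(topics)
```

In this sketch the two APT articles merge into one topic while the ransomware article stays in its own cluster. A cluster with no counterpart in earlier time windows would correspond to an emerging topic, although how the thesis makes that comparison is not specified in the abstract.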

Keywords/Search Tags:

threat intelligence, topic detection, transfer learning, feature fusion

Related items

1	Research On Threat Intelligence Analysis Based On Multidimensional Data
2	Research On APT Detection Technology Based On Threat Intelligence
3	Research On Threat Hunting Methods Based On Threat Intelligence And Lineage Behavior Graph
4	Research On Internal Network Security Threat Detection Method For Hybrid Network Data
5	Research On Network Threat Intelligence Information Extraction Based On Deep Learning
6	Design And Implementation Of Intelligent Web Threat Intelligence Detection System
7	Research On The Construction And Application Technology Of Threat Intelligence Knowledge Graph
8	Research On Parallel Mining Technologies Of Threat Intelligence For Internet Big Data
9	Study On Trustworthy Analysis Of Threat Intelligence Based On Machine Learning
10	Multi-task Classification Technology Based On Feature Fusion And Threat Evaluation Of Malware