Research And Implementation Of Entity Recognition Model For Cyber Threat Intelligence

Posted on: 2021-08-18 | Degree: Master | Type: Thesis
Country: China | Candidate: H Wu | Full Text: PDF
GTID: 2518306308477444 | Subject: Cyberspace security
Abstract/Summary:
In the age of big data, the attack methods used in cyberspace are increasingly complex, and cyberspace security, including personal property and privacy security, enterprise operation security, and national information infrastructure security, faces severe challenges. As one of the important network security technologies, cyber threat intelligence conducts forensics and correlation analysis of relevant intelligence evidence in cyberspace before an attack occurs, and identifies unknown threat sources through feature learning. Cyber threat intelligence is moving network attack and defense from traditional passive response after an attack to proactive defense beforehand. Unstructured text intelligence contains a large amount of high-value event-level intelligence, including but not limited to attack methods and attack strategies. Because the network environment is open, unstructured intelligence source data is also mixed with a large amount of invalid or interfering information, so quickly identifying key, valuable intelligence is extremely challenging. Starting from the intelligence itself, this paper studies the recognition of intelligence entities in the field of cyber threat intelligence. The main work and contributions are as follows:

(1) To address the difficulty of recognizing entities in unidentified network threat text intelligence and of locating key intelligence entities, this paper proposes a cyberspace intelligence entity recognition model based on Attention_BiLSTM-CRF. A language model trained with a shallow neural network is used to pre-train character embeddings on a mixed corpus of a certain scale, and these embeddings serve as the feature inputs to the neural network layer of the entity recognition model. A bidirectional Long Short-Term Memory network (BiLSTM) then learns the context of the input sequence in both directions and extracts the global features of the text sequence. On this basis, to improve the accuracy of key entity recognition, this paper builds a character-level attention mechanism (Attention) for intelligence entities, which calculates the similarity between the target character of each state and the other characters in the sequence in order to assign more weight to key intelligence entities. Finally, a conditional random field (CRF) models the relationships among the labels in the label sequence predicted by the neural network layer, learning the dependencies between labels and thereby avoiding illegal output sequences. A cyber threat intelligence text data set is manually annotated, and cross-validation is used to train, validate, and test the entity recognition model. The experimental results show that, with this model, the F1-score on the six major entity recognition tasks for intelligence events reaches up to 84.10%. Multiple sets of comparative experiments show that the overall performance of the proposed model is better than that of the existing models.

(2) In view of the market requirements for intelligence platforms in security operations and the functional requirements of users, this paper designs and develops an entity recognition platform for cyber threat intelligence. Based on a thorough investigation of existing intelligence platforms, the platform includes an intelligence data collection module, an intelligence entity recognition module, an intelligence association analysis module, and an intelligence visualization module. The intelligence data collection module formulates different collection strategies for intelligence of different structures and levels, collects source data from multi-source heterogeneous intelligence, and formulates an intelligence dictionary based on the entity expression model of cyber threat intelligence to eliminate structural differences and aggregate multi-source heterogeneous basic intelligence. The intelligence entity recognition module uses the Attention_BiLSTM-CRF cyber threat intelligence entity recognition model to automatically identify and extract key intelligence entities from unstructured intelligence text. The intelligence association analysis module uses related searches in the intelligence database and a word relationship model based on intelligence word embeddings to perform association discovery for basic intelligence samples and attack analysis for event intelligence entities, respectively. Finally, the intelligence visualization module of the platform displays the results of the above functions.
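
As an illustration of the character-level attention step in contribution (1), the following is a minimal sketch in plain Python. It is not the thesis's implementation: the abstract does not specify the similarity function, so dot-product similarity followed by a softmax is assumed here, and the function names and toy vectors are hypothetical. For simplicity, each character also attends to itself.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def char_attention(hidden):
    """Weight every character state by its similarity to each target state.

    hidden: list of T vectors (lists of floats), standing in for BiLSTM outputs.
    Returns (contexts, weights): T context vectors and a T x T weight matrix.
    Assumption: dot-product similarity; the thesis may use a different score.
    """
    dim = len(hidden[0])
    contexts, weights = [], []
    for target in hidden:
        # Similarity between the target character and every character.
        scores = [sum(t * o for t, o in zip(target, other)) for other in hidden]
        w = softmax(scores)
        # Context vector: similarity-weighted sum of all character states.
        ctx = [sum(w[j] * hidden[j][d] for j in range(len(hidden)))
               for d in range(dim)]
        weights.append(w)
        contexts.append(ctx)
    return contexts, weights


# Toy states for a three-character sequence (made-up values).
hidden_states = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
contexts, weights = char_attention(hidden_states)
```

Each row of `weights` sums to 1, and characters whose states resemble the target state receive the larger share, which is the sense in which attention can emphasize the characters of key entities.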
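
To make concrete how a CRF layer can avoid illegal label sequences, as described in contribution (1), here is a small Viterbi decoding sketch in plain Python. Several things are assumptions made for illustration only: the BIO tagging scheme, the hypothetical TOOL entity type, and all scores. A trained CRF learns its transition scores from data; the hard prohibition below simply stands in for a strongly negative learned score.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions: list of T dicts mapping label -> score from the network layer.
    transitions: dict mapping (prev_label, label) -> score; missing pairs
                 are treated as forbidden (negative infinity).
    """
    neg_inf = float("-inf")
    # best[t][y] = best score of any path ending in label y at position t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            cand = {p: best[-1][p] + transitions.get((p, y), neg_inf)
                    for p in labels}
            prev = max(cand, key=cand.get)
            scores[y] = cand[prev] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(best[-1], key=best[-1].get)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical BIO labels; "O" followed by "I-TOOL" is the illegal transition.
labels = ["O", "B-TOOL", "I-TOOL"]
transitions = {(p, y): 0.0 for p in labels for y in labels
               if not (p == "O" and y == "I-TOOL")}
emissions = [
    {"O": 2.0, "B-TOOL": 0.5, "I-TOOL": 0.1},
    {"O": 0.2, "B-TOOL": 1.0, "I-TOOL": 1.5},
    {"O": 0.3, "B-TOOL": 0.2, "I-TOOL": 1.8},
]

greedy = [max(e, key=e.get) for e in emissions]   # per-position argmax
decoded = viterbi(emissions, transitions, labels)  # sequence-level decoding
```

With these toy scores, picking the best label at each position independently gives O, I-TOOL, I-TOOL, which starts an entity with an inside tag. Decoding over whole sequences with transition scores gives O, B-TOOL, I-TOOL instead.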
Keywords/Search Tags: cyber threat intelligence, entity recognition, deep learning, attention mechanism