Research And Application Of Knowledge Extraction Methods For Cybersecurity

Posted on: 2024-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: X Huang
Full Text: PDF
GTID: 2568307079472054
Subject: Electronic information
Abstract/Summary:
With the network security situation on the Internet growing increasingly severe in recent years, there is an urgent need to make effective use of the massive, complex data in networks and to promote the intelligent development of network security technologies. Knowledge graph technology is an important means of efficiently mining core content from massive, heterogeneous data and turning it into high-value data assets. Cybersecurity data contains many specialized terms and is semantically complex, so traditional graph construction methods struggle to accurately obtain the entities and relationships in the data. This thesis therefore studies the key technologies of cybersecurity knowledge graph construction, focusing on entity extraction and relationship extraction in the cybersecurity knowledge extraction stage. Its main work is as follows:

(1) To address the difficulty of entity extraction caused by the many terms and complex structure of cybersecurity entities, an entity extraction model based on multidimensional word-level feature fusion embedding is proposed. The model constructs multidimensional features, such as word-granularity external prior knowledge and location features, to provide rich cybersecurity entity features, and then fuses these word-level features using a BERT model pre-trained on a cybersecurity corpus. The fusion vector and the feature vector are then encoded separately, the entity feature representation is reinforced by multi-head attention, and finally a CRF model performs label decoding to achieve cybersecurity entity extraction. The experimental results show that the proposed entity extraction model outperforms the baseline model on network security entity extraction.

(2) To address the problems of complex cybersecurity semantics, difficult relationship mining, and overlapping triples, this thesis proposes a relationship extraction model based on a deep graph convolutional neural network. The model uses features at two scales, the text domain and the spatial domain, to represent the connections between cybersecurity entities. It constructs sentence syntactic dependency graphs and weighted entity connection graphs and feeds them to two GCN networks that extract features separately, so that the model can learn the interaction between the global structural features of sentences and the local entities. To solve the overlapping triple problem, entity pairs are enumerated and the relationships of each pair are then assigned by multi-label classification, which yields the network security triples. The validity of the model is verified through experiments on the public datasets NYT and WebNLG, and its performance in extracting overlapping triples is evaluated by reconstructing the dataset. Finally, the ability of the model to extract cybersecurity relations is evaluated on a cybersecurity dataset.

(3) The functions of the two models above are integrated into a knowledge extraction system for cybersecurity. The system is designed and implemented following software engineering practice and applies the cybersecurity entity extraction and relationship extraction models. It provides entity extraction, relationship extraction, and knowledge graph construction functions for cybersecurity, together with their visual display, and allows users and administrators to manage and maintain the relevant data in the system, which improves its usability.
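The overlapping-triple step described in item (2), entity pair enumeration followed by multi-label classification, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the entity names, the relation labels, the `toy_scores` function (a stand-in for a trained classifier such as the GCN-based model), and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import exp

# Hypothetical entities and relation labels, for illustration only.
ENTITIES = ["malware_a", "vuln_b", "host_c"]
RELATIONS = ["exploits", "targets"]


def sigmoid(x):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-x))


def enumerate_pairs(entities):
    """Return every ordered (head, tail) pair of distinct entities."""
    return list(permutations(entities, 2))


def toy_scores(head, tail):
    """Stand-in for a trained model: one logit per relation label.

    The values are made up; a real system would compute them from
    learned sentence and entity features.
    """
    table = {
        ("malware_a", "vuln_b"): [3.0, -3.0],
        ("malware_a", "host_c"): [2.0, 2.0],
    }
    return table.get((head, tail), [-5.0, -5.0])


def extract_triples(entities, relations, score_fn, threshold=0.5):
    """Multi-label decision per entity pair.

    Each relation is thresholded independently, so one pair may
    receive zero, one, or several relations. That independence is
    what allows overlapping triples to be emitted.
    """
    triples = []
    for head, tail in enumerate_pairs(entities):
        for relation, logit in zip(relations, score_fn(head, tail)):
            if sigmoid(logit) >= threshold:
                triples.append((head, relation, tail))
    return triples


triples = extract_triples(ENTITIES, RELATIONS, toy_scores)
for triple in triples:
    print(triple)
```

In this toy run the pair (`malware_a`, `host_c`) receives both relations, and `malware_a` also appears in a triple with `vuln_b`, so both kinds of overlap (a shared entity pair and a shared single entity) appear. A single-label softmax classifier would assign at most one relation to each pair and could not produce the first kind.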
Keywords/Search Tags: Cybersecurity, Knowledge Graph, Entity Extraction, Relationship Extraction, Multidimensional Word-level Features