Research On Key Technologies Of Knowledge Extraction For Chinese Threat Intelligence

Posted on:2023-09-20

Degree:Master

Type:Thesis

Country:China

Candidate:Q Y Wang

GTID:2568306905468904

Subject:Engineering

Abstract/Summary:

The Internet has become an important force in the new era for improving production efficiency, promoting innovation and change, and accelerating human development. With the vigorous development of network information technology, cyberspace security threats have gradually penetrated social production and daily life. Traditional data mining and analysis methods in the field of cyber security can no longer support China's Internet industry as it approaches a new historical inflection point. The knowledge graph, as a technical means of processing and visualizing unstructured data, has created a research boom at home and abroad. The purpose of this thesis is to study the knowledge extraction techniques involved in constructing knowledge graphs in the Chinese threat intelligence domain, including named entity recognition and entity relationship extraction.

At present, Chinese named entity recognition mostly matches text sequences against dictionaries to obtain words, and then uses a lattice (grid) structure or a graph structure to introduce the word information. However, these two ways of integrating lexical knowledge do not consider global semantic interaction, introduce more interfering words, and fail to effectively resolve word boundary conflicts. Most current Chinese entity relationship extraction uses a model with character-level input to classify relationships, which does not make full use of the lexical information and entity information in the input sequence.

To address these problems, for the named entity recognition task this thesis proposes a knowledge fusion method based on Lexicon-matched Word Inject (LWI), which innovates in how semantic information is extracted from the input sequence and how lexical information in the sequence is used. The method uses a pre-trained language model to encode characters, captures sentence context features with a Transformer encoder, injects lexicon word knowledge for each character, and then integrates each character with its different matched words through a multi-head self-attention mechanism to improve recognition.

For the relationship extraction task, this thesis proposes a relationship extraction method based on multi-feature embedding, which innovates in the feature information embedded into the model. The method investigates how to perform multi-feature embedding in the input representation layer of the entity relationship extraction model. The multi-feature embedding process integrates the head and tail entity embedding vectors, the position feature vectors of the head and tail entities relative to each character, and the external vocabulary embedding vectors in the input sequence into the character vector as the input to the model's encoding layer. A BiLSTM model then performs feature extraction to enhance the extraction effect.

To validate the models, experiments were conducted on a general-domain dataset and a self-built threat intelligence dataset. The final experiments show that both models perform well on both types of datasets, which validates their effectiveness.
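The two input-side ideas in the abstract, collecting the lexicon words that cover each character and computing each character's position relative to the head and tail entities, can be illustrated with a minimal sketch in plain Python. This is not the thesis implementation: the function names, the toy lexicon, the `max_len` limit, and the signed-distance convention are all assumptions made for illustration, and the neural components (pre-trained encoder, Transformer encoder, multi-head self-attention, BiLSTM) are omitted.

```python
from typing import Dict, List, Set, Tuple


def match_lexicon_words(sentence: str, lexicon: Set[str],
                        max_len: int = 4) -> Dict[int, List[str]]:
    """For each character index, list the lexicon words that cover it.

    Hypothetical helper: scans every substring of 2..max_len characters
    and records a hit for all character positions inside the word.
    """
    matched: Dict[int, List[str]] = {i: [] for i in range(len(sentence))}
    for start in range(len(sentence)):
        last_end = min(start + max_len, len(sentence))
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                for i in range(start, end):
                    matched[i].append(word)
    return matched


def relative_positions(length: int, span: Tuple[int, int]) -> List[int]:
    """Signed distance from each character to an entity span [start, end).

    Assumed convention: 0 inside the entity, negative to its left,
    positive to its right.
    """
    start, end = span
    positions = []
    for i in range(length):
        if i < start:
            positions.append(i - start)
        elif i >= end:
            positions.append(i - end + 1)
        else:
            positions.append(0)
    return positions


if __name__ == "__main__":
    # Toy sentence and lexicon (assumed data, not from the thesis).
    sentence = "木马攻击服务器"
    lexicon = {"木马", "攻击", "服务", "服务器"}

    words_per_char = match_lexicon_words(sentence, lexicon)
    for i, ch in enumerate(sentence):
        print(ch, words_per_char[i])

    # Head entity "木马" occupies [0, 2); tail entity "服务器" occupies [4, 7).
    print(relative_positions(len(sentence), (0, 2)))
    print(relative_positions(len(sentence), (4, 7)))
```

In this toy input the characters of "服务" are covered by both "服务" and "服务器", which is the kind of word boundary conflict the abstract mentions. In a model along the lines described, the per-character word lists would be embedded and fused with the character representation by attention, and the relative positions would be looked up in position embedding tables.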

Keywords/Search Tags:

Knowledge graph, Threat intelligence, Named entity recognition, Entity relationship extraction

Related items

1	Research And Application Of Threat Intelligence Knowledge Graph Construction Method For Unstructured Data
2	Research On Knowledge Graph Construction Technology For Cyber Threat Intelligence
3	Research On Key Techniques Of Named Entity Recognition And Relationship Extraction
4	Research On Unstructured Threat Intelligence Entity Extraction Method Based On Machine Learning
5	Research On Key Technologies For Construction And Application Of Threat Intelligence Knowledge Graph
6	Construction And Application Of Knowledge Graph In Financial Field
7	Knowledge Mining Based On Statistical Snowball Models
8	Joint Extraction Of Named Entity Recognition And Entity Relationship Based On Neural Network
9	Research On Construction Of Knowledge Graph For Cyber Threat Intelligence
10	Research And Implementation Of Chinese Entity Relationship Extraction Based On Deep Learning