Research On Entity Relation Extraction Method For Short Text

Posted on: 2023-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: K Q Wu
Full Text: PDF
GTID: 2568307118495584
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of science and technology, the world has entered the Internet era, and daily life is flooded with large amounts of heterogeneous information. As the global situation changes, more and more information about scientific and technological security appears on the Internet in the form of news, blogs, and forum posts. Obtaining hot events from this information, helping situation security experts analyze the important information in those events, and evaluating the risk level of those events have become urgent problems. With the development of natural language processing, entity relation extraction can be applied to the short sentences associated with hot events, helping experts identify the relation type of entity pairs and perceive potential risks. However, existing entity relation extraction models suffer from a lack of context information, poor feature extraction for longer sentences, and missing entity tags, all of which lead to poor relation classification. To address these problems, this paper studies entity relation extraction and systematically analyzes context extraction, enhanced information extraction, and entity tags. The main research contents of this paper are as follows:

(1) An adaptive context information extraction method is proposed, which handles both the use and the lack of context information. The method includes two research aspects: 1) on the basis of threshold segmentation, the distance between the entity pair in a sentence is compared against a threshold to judge the length relationship of the pair, and the corresponding feature vector is assigned to the sentence sequence accordingly; 2) the candidate feature vectors are processed with a weight matrix, and the results are concatenated and fed into the relation classification module, which extracts the important information from them during training. The experiments show that the adaptive context information extraction method effectively compensates for missing feature information and improves the F1 score of the entity relation extraction model by about 0.85% compared with the mainstream model.

(2) An enhanced information extraction method based on Bi-LSTM with attention is proposed, which addresses the model's poor information extraction for longer sentences. It mainly includes two aspects of research: 1) to address the extraction of long-distance dependency information in longer sentences, the advantages and disadvantages of modeling long sentences with neural networks are analyzed and a bidirectional LSTM is built to capture the forward and backward dependencies of longer sentences; 2) to address the noisy data and weak logic in the feature information extracted by the Bi-LSTM, a self-attention network is used to obtain the effective information in the feature vectors and suppress the influence of noisy data, which strengthens the logic of the extracted features and enhances feature extraction. The experiments show that the Bi-LSTM fused with the attention mechanism enhances information extraction and improves the F1 score of relation classification by about 1.3% compared with the mainstream model.

(3) An efficient relation extraction method based on multi-feature fusion is proposed, which addresses the problems of missing entity tags and inefficient serial feature extraction in the entity relation extraction model. The method includes two research aspects: 1) for the problem of missing entity tags, each entity in the sentence is labeled using natural language processing tools, the labels are converted into label index annotations, and the entity labels are embedded in the feature embedding layer to obtain the entity label features; 2) for the problem of inefficient serial feature extraction, the entity label vectors are input into the feature fusion layer, where multiple vectors, including the entity labels, are fused and finally fed into the relation classification module. The experimental results show that the multi-feature fusion method effectively addresses the problems of missing entity tags and inefficient serial feature extraction, and improves the F1 score of the model's relation extraction by about 2.1% compared with the mainstream model.
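
The threshold segmentation step in contribution (1) can be illustrated with a minimal Python sketch. The abstract does not give the actual rule, threshold value, or feature representation, so everything below is an assumption made for illustration: the function names `entity_distance` and `select_context`, the default `threshold` and `window` values, and the choice to widen the context for close entity pairs while keeping only the entity-to-entity span for distant pairs. It works on raw tokens rather than feature vectors to stay self-contained.

```python
from typing import List, Tuple

# (start, end) token indices of an entity, end exclusive
Span = Tuple[int, int]


def entity_distance(e1: Span, e2: Span) -> int:
    """Count the tokens lying between two entity spans (0 if they touch or overlap)."""
    first, second = sorted([e1, e2])
    return max(0, second[0] - first[1])


def select_context(tokens: List[str], e1: Span, e2: Span,
                   threshold: int = 5, window: int = 2) -> List[str]:
    """Choose a context segment from the entity-pair distance.

    Hypothetical rule, not taken from the thesis:
    - close pair (distance <= threshold): the text between the entities
      carries little context, so add `window` tokens on each side;
    - distant pair: keep only the span from the first entity to the second.
    """
    first, second = sorted([e1, e2])
    if entity_distance(e1, e2) <= threshold:
        start = max(0, first[0] - window)
        end = min(len(tokens), second[1] + window)
    else:
        start, end = first[0], second[1]
    return tokens[start:end]


if __name__ == "__main__":
    sentence = "the new lab in Beijing was funded by the ministry last year".split()
    lab, ministry = (2, 3), (9, 10)
    print(entity_distance(lab, ministry))
    print(select_context(sentence, lab, ministry, threshold=5))
    print(select_context(sentence, lab, ministry, threshold=10))
```

In a full model, the selected segment would be mapped to feature vectors before the weight-matrix step described in aspect 2); the sketch stops at segment selection because the abstract gives no further detail.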
Keywords/Search Tags: Entity relation extraction, Attention mechanism, Multi-feature fusion, Short text