Relation extraction aims to identify semantic relationships between specific entities in unstructured text, and is a core technology for knowledge graph construction, question answering, and personalized recommendation. In closed-domain relation extraction, trigger words, the words in a given context that clearly signal a relationship type, have not been fully exploited, mainly because manual annotation is costly and automatic tagging methods are not yet practical. To derive such valuable relation triggers, this thesis proposes a novel relation extraction model, RETA, which uses bootstrapping to fuse a trigger attention mechanism.

First, when constructing the input sequence, ARG identifiers replace the head and tail entities to prevent overlapping entity embeddings; the head and tail entities are then appended to the end of the input sequence, encoded in an entity-aware manner, and used to construct the context vector matrix. Second, the BERT model generates the text vectors, which are fed into an MLP to enhance the text representation; an attention mechanism with softmax normalization then yields probability distributions over the start and end positions of the relation trigger word. Third, the trigger attention is fused into relation prediction: the start and end weight vectors of the trigger are used to compute a weighted sum, producing the final feature representation and modeling the text content with fused trigger attention features. Finally, the model is optimized with multi-task learning, using cross-entropy as the loss function of each task; the predicted probability of the trigger word models its confidence and is used to adjust the weight of the loss function.

On the document-level DialogRE dataset, the F1 score is 3.5%, 1.9%, 14.4%, 15%, and 13.8% higher than the benchmark
models BERT, BERTs, CNNRE, BiLSTM, and LSTM, respectively. On the sentence-level datasets SemEval-2010 Task 8 and Re-TACRED, the F1 scores improve to varying degrees over the benchmark models CNNRE, BiLSTM, DepPath, and BERT, by up to 13.4%. The F1 score exceeds 55% after only 2 epochs of training on DialogRE, and reaches 45% after 6 epochs on Re-TACRED. Experiments show that the RETA model achieves better extraction performance and higher training efficiency, verifying that triggers play a crucial role in guiding learning. The thesis has 21 figures, 16 tables, and 55 references.
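The trigger attention and weighted-sum fusion described in the second and third steps can be sketched as follows. This is a minimal illustration assuming plain Python lists for token vectors; the function names (`softmax`, `trigger_weighted_sum`) and the averaging of the start/end weights are illustrative assumptions, not the thesis's exact formulation.

```python
import math

def softmax(scores):
    # Normalize raw attention scores into a probability distribution,
    # as done for the trigger start/end position predictions.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def trigger_weighted_sum(token_vectors, start_scores, end_scores):
    # Hypothetical fusion: turn start/end scores into probabilities,
    # average them per token, and take the attention-weighted sum of
    # the token vectors to get the final feature representation.
    p_start = softmax(start_scores)
    p_end = softmax(end_scores)
    weights = [(s + e) / 2 for s, e in zip(p_start, p_end)]
    dim = len(token_vectors[0])
    fused = [0.0] * dim
    for w, vec in zip(weights, token_vectors):
        for i, v in enumerate(vec):
            fused[i] += w * v
    return fused
```

Tokens with high trigger probability thus dominate the fused representation, which is what lets the trigger guide relation prediction.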
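The confidence-weighted multi-task loss in the final step can likewise be sketched. The combination rule here (scaling the trigger loss by the predicted probability of the gold trigger position) is one assumed reading of the confidence weighting; the thesis's actual scheme may differ.

```python
import math

def cross_entropy(probs, gold):
    # Standard cross-entropy for a single example:
    # negative log-probability assigned to the gold label.
    return -math.log(probs[gold])

def multitask_loss(rel_probs, rel_gold, trig_probs, trig_gold):
    # Hypothetical joint loss: the trigger task's predicted probability
    # for the gold position acts as a confidence score that scales its
    # contribution, so unreliable trigger predictions weigh less.
    rel_loss = cross_entropy(rel_probs, rel_gold)
    trig_loss = cross_entropy(trig_probs, trig_gold)
    confidence = trig_probs[trig_gold]
    return rel_loss + confidence * trig_loss
```

When the trigger prediction is confident and correct, its loss term vanishes and only the relation loss drives optimization.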