With the rapid development of Internet technology and artificial intelligence research, network communication and intelligent systems have gradually entered people's daily lives. Intelligent question answering, search engines, and intelligent recommendation all depend on natural language processing technology. Natural language processing is an important research direction in artificial intelligence, and relation extraction is one of its core tasks. Relational facts organize world knowledge in the form of triples; these structured facts play an important role in human knowledge and are implicitly or explicitly embedded in text. The goal of relation extraction is to extract complete triple knowledge from text automatically and efficiently. Jointly extracting entities and relations from sentences as relational triples is crucial for many natural language processing (NLP) tasks and has attracted the attention of many researchers.

Two problems remain in relation extraction research. First, entities in natural language are often composed of multiple words, which places higher demands on a model's ability to recognize long-tail entities. Most studies assume that an entity consists of a single word, which leads to incomplete extracted triples. Second, because of the complexity of natural language, entities in text often participate in complex semantic relationships, and overlapping entities or entity pairs are common; most existing models still handle overlapping triples poorly.

In this paper, relation extraction is treated as a sequence labeling problem and divided into two steps: head entity labeling, followed by tail entity and relation labeling. To address the two problems above, this paper proposes a relation extraction model based on relation-level attention. For the long-tail entity problem, we design a two-pointer network to determine the boundaries of potential entities and match different head entities according to the proximity principle. For the overlapping triple problem, we employ an attention mechanism that focuses only on information beneficial to relation extraction. For the first time, the model uses relation embeddings to calculate the similarity between each relation type and the word at each position in the sentence, which allows correct overlapping triples to be detected efficiently and accurately. Extensive experiments on two benchmark datasets, NYT (The New York Times corpus) and WebNLG, show that the proposed framework outperforms previous methods and achieves significant improvements in F1-score.

After thorough experimental analysis, we identify two areas in which the model can be improved and design corresponding improvements. To improve the accuracy of entity recognition, we adopt a standard span-based model for head entity extraction. To better represent sentences in different contexts, we adopt a multi-head self-attention mechanism that learns context-specific representations of the text in different subspaces. Experiments on the NYT and WebNLG datasets show that the improved model extracts entities more accurately and completely, and achieves higher F1-scores on both datasets.
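The following is a minimal, illustrative sketch (not the authors' released code) of the two mechanisms summarized above: a two-pointer network that tags head-entity start and end positions, and relation-level attention that scores the similarity between each relation embedding and every token, which is then used to tag tail entities separately for each relation so that overlapping triples can be recovered. Contextual token encodings from an encoder such as BERT are assumed as input; all class names, attribute names, and dimensions are hypothetical.

```python
import torch
import torch.nn as nn


class RelationLevelAttentionTagger(nn.Module):
    """Hypothetical sketch of the two-pointer + relation-level attention tagger."""

    def __init__(self, hidden_size: int, num_relations: int):
        super().__init__()
        # Two-pointer network: one sigmoid score per token for head-entity start / end.
        self.head_start = nn.Linear(hidden_size, 1)
        self.head_end = nn.Linear(hidden_size, 1)
        # One learned embedding per relation type (the relation-level attention queries).
        self.rel_emb = nn.Embedding(num_relations, hidden_size)
        # Tail-entity pointers, applied to relation-aware token representations.
        self.tail_start = nn.Linear(hidden_size, 1)
        self.tail_end = nn.Linear(hidden_size, 1)

    def forward(self, token_repr: torch.Tensor):
        # token_repr: (batch, seq_len, hidden) contextual encodings of the sentence.
        # Step 1: head-entity boundary probabilities (two pointers).
        h_start = torch.sigmoid(self.head_start(token_repr)).squeeze(-1)  # (B, L)
        h_end = torch.sigmoid(self.head_end(token_repr)).squeeze(-1)      # (B, L)

        # Step 2: relation-level attention. Each relation embedding attends over the
        # tokens; the weights reflect how similar a relation type is to each word.
        rel = self.rel_emb.weight                                   # (R, H)
        scores = torch.einsum("blh,rh->brl", token_repr, rel)       # (B, R, L)
        attn = torch.softmax(scores, dim=-1)                        # per-relation weights
        # Relation-aware token representations: token encodings scaled by attention.
        rel_tokens = attn.unsqueeze(-1) * token_repr.unsqueeze(1)   # (B, R, L, H)

        # Step 3: tail-entity pointers for every relation; the same tokens can be
        # tagged under several relations, which is what permits overlapping triples.
        t_start = torch.sigmoid(self.tail_start(rel_tokens)).squeeze(-1)  # (B, R, L)
        t_end = torch.sigmoid(self.tail_end(rel_tokens)).squeeze(-1)      # (B, R, L)
        return h_start, h_end, t_start, t_end


if __name__ == "__main__":
    # Random tensors stand in for encoder outputs; shapes only, no trained weights.
    model = RelationLevelAttentionTagger(hidden_size=768, num_relations=24)
    dummy = torch.randn(2, 30, 768)        # batch of 2 sentences, 30 tokens each
    h_s, h_e, t_s, t_e = model(dummy)
    print(h_s.shape, t_s.shape)            # (2, 30) and (2, 24, 30)
```

In this sketch the tail-entity scores are produced per relation type, so a sentence whose entities participate in several relations yields separate pointer sequences for each, which is one plausible way to realize the overlapping-triple handling described in the abstract.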