A knowledge graph is a graph-based data structure for describing and representing the relationships between entities. It models real-world things (entities) and the relationships between them (entity relations) as the nodes and edges of a graph, so that computers can understand and analyze them. Knowledge graphs are widely used in search engines, intelligent question answering, natural language processing, machine learning, and other fields to help computers better understand human language and reasoning. Entity-relation extraction is an important step in building knowledge graphs: it aims to automatically extract entities, and the relations between them, from natural language text, providing data support for knowledge graph construction. Entity-relation extraction is therefore one of the key techniques in natural language processing and knowledge graph construction. Mainstream approaches fall into two categories: pipeline methods, which treat entity recognition and relation extraction as two independent subtasks, performing entity recognition first and relation extraction second; and joint extraction methods, which treat the two as a single task and perform them simultaneously.

In this paper, we propose an entity-relation extraction model based on a global pointer network with potential relation embedding. First, the input text is encoded with BERT, and an entity global pointer is constructed to extract subjects and objects. Meanwhile, the encoded text is mean-pooled to predict the relation labels that may occur in the text. The label information is then fused with the text encoding to construct a subject-object relation global pointer, which marks the head and tail positions of the subjects and objects between which a relation holds. The relation triples in the text are produced by the joint decision of the entity global pointer and the relation global pointer. Adversarial training is introduced during training to improve the robustness of the model.

Experimental results show that, after adversarial training, the model's F1 score on the NYT dataset is 0.4% and 2.7% higher than that of the token-pair linking model TPLinker and the cascade binary tagging framework CasRel, respectively, and 0.6% and 0.7% higher than the two models on the WebNLG dataset. To address the large parameter count of the pre-trained model and the resulting deployment difficulty, the proposed model is distilled with a teacher-student framework: the proposed model serves as the teacher, and an LSTM-based model and a 4-layer BERT model serve as the students for knowledge transfer. Experiments show that the students trained with knowledge distillation outperform the same models trained without it, and that the shallow BERT student trails the teacher by only 2.2% and 2.0% while using fewer than half of the teacher's parameters and running inference faster.
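To make the approach concrete, the sketches below illustrate the main components in PyTorch. All class, function, and parameter names are illustrative assumptions rather than the paper's actual code. First, a minimal entity global pointer: every (start, end) token pair over the BERT encoding is scored as a candidate entity span (the rotary position embedding used in the original global pointer formulation is omitted for brevity).

```python
import torch
import torch.nn as nn

class EntityGlobalPointer(nn.Module):
    """Scores every (start, end) token pair as a candidate entity span.
    Simplified sketch of a global pointer head; names are illustrative."""
    def __init__(self, hidden_size: int, head_size: int = 64):
        super().__init__()
        self.q_proj = nn.Linear(hidden_size, head_size)  # start-token view
        self.k_proj = nn.Linear(hidden_size, head_size)  # end-token view
        self.scale = head_size ** -0.5

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_size) from the BERT encoder
        q = self.q_proj(hidden)
        k = self.k_proj(hidden)
        logits = torch.einsum("bih,bjh->bij", q, k) * self.scale  # (B, L, L)
        # Only spans whose end does not precede their start are valid.
        valid = torch.triu(torch.ones_like(logits, dtype=torch.bool))
        return logits.masked_fill(~valid, float("-inf"))
```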
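The potential-relation prediction step can be sketched as mean pooling over the token encodings followed by a multi-label (sigmoid) classifier over the relation inventory; the module and argument names are assumptions.

```python
import torch
import torch.nn as nn

class PotentialRelationPredictor(nn.Module):
    """Predicts which relation labels may occur in a sentence from the
    mean-pooled BERT encoding (multi-label); a hypothetical sketch."""
    def __init__(self, hidden_size: int, num_relations: int):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_relations)

    def forward(self, hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # hidden: (B, L, H); attention_mask: (B, L), 1 for real tokens
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return torch.sigmoid(self.classifier(pooled))  # (B, num_relations)
```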
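One plausible reading of the feature fusion described above is to add an embedding of each predicted relation to the token encodings, then score subject/object positions per relation over the fused representation; additive fusion is an assumption here, not a confirmed detail of the paper.

```python
import torch
import torch.nn as nn

class RelationGlobalPointer(nn.Module):
    """Fuses a relation embedding with the token encodings, then scores
    token pairs under that relation; a hypothetical sketch."""
    def __init__(self, hidden_size: int, num_relations: int, head_size: int = 64):
        super().__init__()
        self.rel_emb = nn.Embedding(num_relations, hidden_size)
        self.q_proj = nn.Linear(hidden_size, head_size)
        self.k_proj = nn.Linear(hidden_size, head_size)

    def forward(self, hidden: torch.Tensor, rel_ids: torch.Tensor) -> torch.Tensor:
        # hidden: (B, L, H); rel_ids: (B,) one predicted relation per sentence
        fused = hidden + self.rel_emb(rel_ids).unsqueeze(1)  # broadcast over tokens
        q, k = self.q_proj(fused), self.k_proj(fused)
        return torch.einsum("bih,bjh->bij", q, k)  # (B, L, L) pair scores
```

A triple (subject, relation, object) would then be emitted only when the entity pointer and the relation pointer agree on the corresponding span positions, mirroring the joint decision described above.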
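A common way to realize adversarial training for BERT-based models is the Fast Gradient Method (FGM), which perturbs the word-embedding weights along the gradient direction, recomputes the loss, and restores the weights afterwards; the paper's exact adversarial method is not stated here, so FGM is an assumption.

```python
import torch

class FGM:
    """Fast Gradient Method adversarial training helper; a generic recipe."""
    def __init__(self, model: torch.nn.Module, epsilon: float = 1.0,
                 emb_name: str = "word_embeddings"):
        self.model, self.epsilon, self.emb_name = model, epsilon, emb_name
        self.backup = {}

    def attack(self):
        # Perturb embedding weights in the direction of the loss gradient.
        for name, param in self.model.named_parameters():
            if param.requires_grad and self.emb_name in name and param.grad is not None:
                self.backup[name] = param.data.clone()
                norm = torch.norm(param.grad)
                if norm != 0:
                    param.data.add_(self.epsilon * param.grad / norm)

    def restore(self):
        # Undo the perturbation before the optimizer step.
        for name, param in self.model.named_parameters():
            if name in self.backup:
                param.data = self.backup[name]
        self.backup = {}
```

In a training loop this is typically used as: backpropagate the clean loss, call `attack()`, backpropagate the adversarial loss on the perturbed embeddings, call `restore()`, then step the optimizer.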
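Finally, the teacher-student distillation can be sketched as a blend of the hard supervised loss with a soft loss against the teacher's tempered outputs; sigmoid soft targets are used below because global pointer scores are multi-label, and the temperature and blending weight are illustrative assumptions rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      targets: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Generic knowledge-distillation loss for multi-label pointer scores."""
    # Soft loss: match the teacher's tempered sigmoid probabilities.
    soft_targets = torch.sigmoid(teacher_logits / temperature)
    soft = F.binary_cross_entropy_with_logits(student_logits / temperature, soft_targets)
    # Hard loss: standard supervised objective on the gold labels.
    hard = F.binary_cross_entropy_with_logits(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard
```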