| With the spread of Internet technology and the development of big data, more and more of the massive data on the network appears in the form of text. How to make computers understand text content and process it automatically, reducing manual labor and improving efficiency, has become an important issue in Natural Language Processing. Relation extraction is a key step in information extraction. Its main task is to automatically identify the semantic relationship between entities and express it as a relation triple. Relation extraction has broad applications and plays an important role in large-scale knowledge base construction, sentiment analysis, automatic question answering, and other fields. Traditional relation extraction methods target specific professional domains and are therefore difficult to transfer to other domains. In recent years, with the introduction of deep learning into Natural Language Processing, many researchers have begun to apply different neural networks to relation extraction. While these approaches address the problems of traditional methods, they introduce other difficulties, such as the cost of manual labeling. This paper adopts distant supervision for relation extraction to address the problems of supervised relation extraction; however, because the training data must be expanded from a small labeled corpus, noise is introduced. Weakly supervised relation extraction also suffers from NA noise and class imbalance. This paper therefore makes the following contributions: (1) It first introduces the application background and current state of development of entity relation extraction. It then introduces the theoretical and technical foundations of relation extraction, such as the theory of convolutional neural networks. From a machine learning perspective, approaches to relation extraction are divided into
unsupervised, supervised, and distant supervision methods. The paper analyzes the mainstream relation extraction methods at the present stage. (2) Distantly supervised relation extraction suffers from both labeling noise introduced by automatic annotation and NA noise. Previous researchers have usually used multi-instance learning to address the labeling noise. For the NA noise problem, this paper uses a ranking loss. (3) Statistics of the automatically labeled data show that the distribution of relation categories is imbalanced, which negatively affects model training. We add cost-sensitive weighting to the original ranking loss function, which mitigates the effect of the data imbalance to a certain extent and thus improves the accuracy of relation extraction. |
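
The combination of a ranking loss with cost-sensitive weighting described in (2) and (3) can be illustrated with a minimal sketch. It is not the exact formulation used in this paper: it assumes a pairwise ranking loss in which the NA class contributes no positive term, and it assumes inverse-frequency class weights as the cost-sensitive factor. The function names, the margins `m_pos` and `m_neg`, the scale `gamma`, and the weighting scheme are all hypothetical choices made for illustration.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def inverse_frequency_weights(class_counts):
    # Assumed cost-sensitive scheme: rarer classes receive larger weights.
    total = sum(class_counts)
    n = len(class_counts)
    return [total / (n * count) for count in class_counts]


def cost_sensitive_ranking_loss(scores, gold, class_weights, na_index=0,
                                gamma=2.0, m_pos=2.5, m_neg=0.5):
    """Pairwise ranking loss for one example, scaled by a per-class cost.

    scores        -- one model score per class (the NA score is ignored)
    gold          -- index of the gold class; na_index means "no relation"
    class_weights -- per-class cost, e.g. from inverse_frequency_weights
    """
    # Most competitive wrong class, never the NA class itself.
    negatives = [s for c, s in enumerate(scores)
                 if c != gold and c != na_index]
    loss = 0.0
    if negatives:
        # Push the best wrong relation below the negative margin.
        loss += softplus(gamma * (m_neg + max(negatives)))
    if gold != na_index:
        # Push the gold relation above the positive margin.
        # NA examples skip this term, so NA is never scored as a target.
        loss += softplus(gamma * (m_pos - scores[gold]))
    return class_weights[gold] * loss


# Hypothetical class counts: NA dominates, relation 2 is rare.
weights = inverse_frequency_weights([80, 15, 5])
example_loss = cost_sensitive_ranking_loss([0.0, 3.0, -1.0], 1, weights)
```

In this sketch, an example whose gold label is a rare relation contributes more to the total loss than an equally misclassified example of a frequent relation, which is the intended effect of the cost-sensitive term.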