Research And Development Of Relation Mining System For News Sites

Posted on: 2021-07-12
Degree: Master
Type: Thesis
Country: China
Candidate: S Z Su
GTID: 2518306050484514
Subject: Master of Engineering
Abstract/Summary:
Knowledge graphs are an important means of organizing and managing data. They are widely used in search engines, question answering systems, and recommendation systems, across many industries. Manually constructed knowledge graphs are highly accurate, but building them is inefficient and costs a great deal of manpower, material, and money. How to construct knowledge graphs automatically and efficiently is therefore the core problem, and relation extraction, which automatically mines relation triples from text, is an effective way to address it. Text on the Internet is vast, but much of it is not standardized. The purpose of this thesis is to automatically and efficiently mine a large number of relation triples from the more standardized text of news articles, in order to expand existing open-source knowledge graphs.

Relation extraction based on traditional machine learning requires a large number of hand-designed features, which is time-consuming and laborious. Moreover, errors made by upstream natural language processing tools propagate to the extraction step. Deep learning can instead learn features that are effective for the task directly from text data. This thesis studies relation extraction algorithms based on deep learning, and designs and implements a relation mining system for news websites.

The thesis first describes the significance of relation extraction research and the state of research in China and abroad. It then introduces the basic principles of deep learning, the convolutional neural network and the long short-term memory network commonly used in natural language processing, and, briefly, word embedding. It discusses in detail how the difficulties of Chinese relation extraction differ from those of English relation extraction. To address the shortcomings of existing neural networks on these two tasks, a Transformer-based relation extraction model is designed for each language.

The main contributions of this thesis are as follows:

1. Convolutional neural networks cannot capture long-distance features in relation extraction, and long short-term memory networks cannot be parallelized. To address both problems, relation extraction algorithms that use a Transformer as the main encoder are designed separately for Chinese and for English, according to the characteristics of each task.
2. To address the shortage of labeled corpora, a semi-supervised learning method is proposed that trains the neural network model iteratively. News text is continuously crawled from the web, the existing model extracts relation triples from it, and the news text whose predictions have high confidence is added to the training set. This enlarges the training set and improves model performance.
3. Adversarial training is adapted to the Chinese and English relation extraction models. Adversarial samples are constructed by adding weak random noise to the input data, and the model is trained on these samples. Adversarial training not only improves model performance but also makes the model more robust and effectively reduces the noise introduced by semi-supervised learning.
4. Based on the results for Chinese and English relation extraction, a relation mining system for news websites is designed and implemented. The system also supports extraction of one special relation: hyponymy.
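The iterative semi-supervised loop in contribution 2 and the noise perturbation in contribution 3 can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the names `self_train`, `perturb`, `fetch_news`, and `train`, the confidence threshold of 0.9, the round limit, and the use of uniform noise are all assumptions made for illustration, and the Transformer encoder itself is abstracted behind a generic `train` callback.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]      # (head entity, relation, tail entity)
Example = Tuple[str, Triple]       # (sentence, relation triple)
# A predictor maps a sentence to (predicted triple, confidence in [0, 1]).
Predictor = Callable[[str], Tuple[Triple, float]]
# A trainer builds a predictor from labeled examples (hypothetical interface).
Trainer = Callable[[List[Example]], Predictor]


def self_train(
    train_set: Sequence[Example],
    fetch_news: Callable[[], List[str]],
    train: Trainer,
    rounds: int = 3,          # assumed stopping rule, not from the thesis
    threshold: float = 0.9,   # assumed confidence cutoff, not from the thesis
) -> Tuple[Predictor, List[Example]]:
    """Grow the training set with high-confidence predictions, then retrain."""
    labeled = list(train_set)
    model = train(labeled)
    for _ in range(rounds):
        confident = []
        for sentence in fetch_news():
            triple, confidence = model(sentence)
            if confidence >= threshold:
                confident.append((sentence, triple))
        if not confident:
            break  # nothing new to learn from in this round
        labeled.extend(confident)
        model = train(labeled)
    return model, labeled


def perturb(
    embedding: Sequence[float],
    epsilon: float = 0.01,                 # assumed noise magnitude
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Add weak random noise to an input vector to form a perturbed sample.

    Uniform noise in [-epsilon, epsilon] is an assumption; the abstract only
    says that weak random noise is added to the input data.
    """
    rng = rng or random.Random()
    return [x + rng.uniform(-epsilon, epsilon) for x in embedding]
```

In a real system `train` would fit the relation extraction model (optionally on inputs passed through `perturb`), and `fetch_news` would return newly crawled sentences. Because only predictions at or above the threshold are added, a higher threshold trades training-set growth for less label noise.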
Keywords/Search Tags: Relation Extraction, Knowledge Graph, Transformer, Adversarial Training