Research On The Extraction Of Weibo Character Relations Based On Deep Learning And Relationship Path

Posted on: 2020-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: T L Xu
Full Text: PDF
GTID: 2428330578483309
Subject: Software engineering
Abstract/Summary:
With the development of computer and network communication technology, the rapid growth of Internet technology, and the iterative updating of the Web 3.0 model, social media platforms such as Sina Weibo, Tencent Weibo, Facebook, and Twitter have made access to information faster and more convenient. These platforms generate a large volume of online text every day, and that text contains a great deal of information. Information extraction is a natural language processing task that automatically extracts structured information from such unstructured text. The extracted structured information, such as the entity triple "(jack ma, founder, alibaba)", can serve as a knowledge source for building a large-scale knowledge base.

People, as subjects of knowledge, and the relationships between them have an important influence on the generation and dissemination of information. Researchers have gradually begun to pay attention to the role of person-related resources in knowledge base construction, so extracting the relationships between people has become a research hotspot. However, traditional relation extraction methods often rely on extensive feature engineering and on natural language processing tools. Deep learning has been applied successfully in natural language processing and addresses these problems of the traditional methods. This thesis therefore studies the extraction of relationships between people in micro-blog text based on deep learning. The main work includes the following aspects.

First, the thesis constructs a micro-blog person relation extraction model. Starting from a convolutional neural network (CNN), the model adds a bidirectional long short-term memory (BiLSTM) encoding layer to form a hybrid neural network. The BiLSTM encodes the context of the input words, which strengthens the CNN: the hybrid network extracts more effective semantic features from the text and handles text in which the two entities are far apart. Finally, an improved classifier with a customized objective function is used, and the training process of the model is optimized.

Second, to address the problems that arise when distant supervision is used to annotate the dataset automatically, the thesis changes the input from the sentence level to the bag level and constructs a relation path encoder based on the relation paths between entities in the micro-blog text. The encoder measures the probability of a relation given a relation-path inference chain in the text. It is combined with the extraction model above to compute the relation type that best represents a sample bag.

For the experiments, distant supervision is combined with the Chinese entity triples provided by an external knowledge base to align the crawled data and generate the data used in the subsequent research. To verify the validity of the proposed model, relation extraction models based on different methods were selected and several comparative experiments were designed. The experiments show that, under the same experimental environment, the proposed model performs better than traditional relation extraction models based on feature engineering, and that it generalizes better than the currently popular relation extraction models based on CNN and LSTM.
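The distant-supervision alignment and the move from sentence-level to bag-level input can be illustrated with a small sketch. This is not the thesis implementation. The function name `build_bags`, the substring-matching rule, the sample posts, and the second triple (with placeholder names) are all assumptions made for illustration. Only the "(jack ma, founder, alibaba)" triple comes from the abstract.

```python
from collections import defaultdict

# Hypothetical knowledge-base triples (head, relation, tail).
# Only the first triple appears in the abstract; the second uses placeholder names.
KB_TRIPLES = [
    ("jack ma", "founder", "alibaba"),
    ("zhang san", "colleague", "li si"),
]

# Hypothetical crawled posts, lower-cased so plain substring matching works.
POSTS = [
    "jack ma founded alibaba.",
    "jack ma gave a speech at an alibaba event.",  # mentions both entities, but not "founder"
    "zhang san and li si work in the same office.",
    "nothing relevant is mentioned in this post.",
]


def build_bags(triples, sentences):
    """Group sentences into bags keyed by the (head, tail) entity pair.

    Distant-supervision assumption: any sentence that mentions both entities
    of a knowledge-base triple is labeled with that triple's relation. Some of
    these labels are wrong, which is why the bag is scored as a whole instead
    of trusting each sentence on its own.
    """
    bags = defaultdict(lambda: {"relations": set(), "sentences": []})
    for head, relation, tail in triples:
        for sentence in sentences:
            if head in sentence and tail in sentence:
                bag = bags[(head, tail)]
                bag["relations"].add(relation)
                if sentence not in bag["sentences"]:
                    bag["sentences"].append(sentence)
    return dict(bags)


bags = build_bags(KB_TRIPLES, POSTS)
for pair, bag in bags.items():
    print(pair, sorted(bag["relations"]), len(bag["sentences"]))
```

In this toy data the second post lands in the ("jack ma", "alibaba") bag even though it does not express the founder relation. This is the kind of labeling noise that bag-level input is meant to absorb.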
Keywords/Search Tags: Micro-blog, Deep learning, Relation extraction, LSTM, CNN