
Research On Entity Relation Extraction In The Field Of Tea Diseases And Pests

Posted on: 2022-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Mao
Full Text: PDF
GTID: 2518306512453384
Subject: Computer technology
Abstract/Summary:
Tea diseases and pests are an important factor restricting the development of the tea industry and have always attracted the attention of tea farmers. With the development of the Internet, a large number of unstructured or semi-structured texts related to tea diseases and pests have appeared, but traditional search methods cannot retrieve the relevant information efficiently and accurately. Because a knowledge graph supports semantic search over entities and the relations between them, constructing a knowledge graph for the field of tea diseases and pests is imperative. Relation extraction, the core step in constructing such a knowledge graph, extracts the semantic relation between two entities. This thesis uses deep neural networks and distant supervision to extract entity relations in the field of tea diseases and pests, laying a foundation for constructing a knowledge graph of tea diseases and pests. The main work focuses on the following three aspects:

(1) A corpus for the field of tea diseases and pests is constructed using distant supervision. First, a small knowledge base for the field is built from domain knowledge in the form of triples. Second, the entities are used as queries to crawl as much training text as possible, addressing the shortage of training data. Third, the corpus text is filtered, cleaned, and segmented, and the corpus is then annotated by automatically aligning the constructed knowledge base with the processed text. Finally, word vectors are trained on the corpus text in an unsupervised manner so that text features can be extracted more effectively.

(2) A method based on the channel attention mechanism is proposed for extracting entity relations of tea diseases and pests. In distantly supervised relation extraction, the PCNN model is often used to extract the semantic features of a sentence. During extraction, the sentence is divided into three segments with the two entities as boundaries, and max pooling is applied to each segment. The problem is that this cannot distinguish which segment contributes more to the final sentence classification. To address this, we draw on the channel attention mechanism used in the image field and assign a different weight to each sentence segment after convolution, so that segments important to the final relation classification receive larger weights. This highlights the influence of key segments in PCNN and mines sentence features more effectively, improving the accuracy of the model.

(3) A method based on a gating module is proposed for extracting relations between tea diseases and pests. To handle bags in the corpus in which every sentence is noisy, previous work proposed constructing a "super bag" on top of the bags and applying the attention mechanism again, but the results are still unsatisfactory. This thesis therefore proposes a relation extraction solution based on a gating module. The module filters the semantic feature vectors of the bags before weights are assigned by the inter-bag attention mechanism, so that some noisy bags are filtered out completely. This reduces the interference of noisy sentences and further improves the accuracy of the model.

Experimental results show that, compared with traditional relation extraction methods, the model with the two improvements, the channel attention mechanism and the gating module, achieves significantly better accuracy, F1, and other metrics.
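The automatic alignment step in (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the triples, the sentences, the relation name `harms`, and the function name `align` are all assumptions made for this example, and plain substring matching stands in for whatever entity matching the author actually used.

```python
from collections import defaultdict

# Illustrative toy knowledge base of (head, relation, tail) triples.
# These entries are assumptions for the example, not data from the thesis.
KNOWLEDGE_BASE = [
    ("tea anthracnose", "harms", "tea leaves"),
    ("tea green leafhopper", "harms", "tea shoots"),
]

# Illustrative sentences standing in for crawled and cleaned text.
SENTENCES = [
    "Tea anthracnose mainly damages tea leaves during humid seasons.",
    "A lecture on tea anthracnose was held next to a shop selling tea leaves.",
    "The weather was fine in the tea garden.",
]


def align(sentences, knowledge_base):
    """Label each sentence with every triple whose two entities it mentions.

    Returns a dict mapping (head, tail, relation) to a "bag" of sentences.
    """
    bags = defaultdict(list)
    for sentence in sentences:
        text = sentence.lower()
        for head, relation, tail in knowledge_base:
            # Distant supervision assumption: if both entities co-occur,
            # the sentence is taken to express the relation (may be wrong).
            if head in text and tail in text:
                bags[(head, tail, relation)].append(sentence)
    return dict(bags)


bags = align(SENTENCES, KNOWLEDGE_BASE)
for key, bag in bags.items():
    print(key, len(bag))
```

In this toy run the second sentence mentions both entities without expressing the relation, yet it receives the same label as the first. This is the kind of labeling noise that the segment weighting in (2) and the gating module in (3) are meant to reduce.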
Keywords/Search Tags: Tea diseases and pests, Distant supervision, Entity relation extraction, Channel attention mechanism, Gate module