Named Entity Recognition Of Tobacco Pests By Integrating GCN And BERT

Posted on: 2024-01-15 | Degree: Master | Type: Thesis
Country: China | Candidate: R Feng | Full Text: PDF
GTID: 2543307109999759 | Subject: Intelligent Manufacturing Technology (Professional Degree)
Abstract/Summary:
Tobacco is frequently affected by pests and diseases during cultivation, and the quality of pest and disease control directly affects tobacco yield and quality. The pests and diseases that tobacco encounters at different growth periods call for different control methods; there are many types of pests and diseases, and the control methods are complex and diverse. A large amount of unstructured data on tobacco pest control is scattered across texts, and much of the information on control methods has not been refined or summarized, which restricts further improvement in the specialization and efficiency of tobacco pest and disease prevention and control. Extracting prevention and control information from a corpus of tobacco pests and diseases is therefore of great practical significance in helping growers deal with pests and diseases, and named entity recognition is the most critical underlying task.

At present, there is no research on named entity recognition in the field of tobacco pests and diseases, and the corpus in this field is characterized by abbreviated entities and complex, varied entity expressions. This thesis therefore uses deep learning to identify named entities in tobacco pest and disease texts, with the aim of improving recognition accuracy. Taking tobacco pests and diseases as the research object, and addressing the problems of long entities, long-range sequence dependence, and entity abbreviation, it proposes a named entity recognition method that integrates GCN and BERT.

First, a tobacco pest and disease dataset is constructed, laying the data foundation for the study. Second, the named entity recognition model is optimized and improved, and a method based on BERT-BiGRU(GCN)-MHSA-CRF is proposed: text is vectorized by the BERT model, and a bidirectional gated recurrent unit (BiGRU), a graph convolutional network (GCN), a multi-head self-attention mechanism (MHSA), and a conditional random field (CRF) are introduced. Third, the generalization, effectiveness, and stability of the proposed model are verified on four public datasets. Finally, an online named entity recognition system for tobacco pests and diseases based on a B/S (browser/server) architecture is developed, and the recognition results are visualized, which verifies the feasibility of practical application.

The method provides a new approach to named entity recognition for tobacco pests and diseases and offers guidance for improving recognition performance in this field. It also provides underlying technical support for downstream work on tobacco pest control, such as information extraction, question answering systems, text classification, and knowledge graph construction, and helps tobacco growers find appropriate control methods more efficiently and accurately, giving it broad application prospects and strong practical significance.
Keywords/Search Tags:Named entity recognition, GCN, Deep learning, Tobacco pests and diseases, BERT
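
The abstract names a conditional random field (CRF) as the final layer of the proposed model but does not describe its implementation. As a rough illustration only, the sketch below shows Viterbi decoding, the standard way a CRF layer picks the best tag sequence from per-token scores. The tag set (`O`, `B-PEST`, `I-PEST`), the emission scores, and the transition scores are all hypothetical toy values, not taken from the thesis.

```python
from typing import Dict, List

NEG_INF = float("-inf")

# Hypothetical BIO tag set for a single "PEST" entity type (an assumption).
TAGS = ["O", "B-PEST", "I-PEST"]

# Toy transition scores: every move scores 0 except O -> I-PEST, which is
# forbidden because an entity cannot start with an "inside" tag.
TRANSITIONS: Dict[str, Dict[str, float]] = {p: {c: 0.0 for c in TAGS} for p in TAGS}
TRANSITIONS["O"]["I-PEST"] = NEG_INF

# Toy emission scores for a four-token sentence. In the described model these
# would come from the upstream layers; here they are made up.
EMISSIONS: List[Dict[str, float]] = [
    {"O": 2.0, "B-PEST": 0.5, "I-PEST": 0.1},
    {"O": 0.2, "B-PEST": 1.0, "I-PEST": 1.5},
    {"O": 0.1, "B-PEST": 0.3, "I-PEST": 2.0},
    {"O": 1.8, "B-PEST": 0.2, "I-PEST": 0.4},
]


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   tags: List[str]) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence."""
    if not emissions:
        return []
    # best[t][tag] is the best score of any path that ends in `tag` at step t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score plus transition.
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Start from the best final tag and follow the back-pointers.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def greedy_decode(emissions: List[Dict[str, float]], tags: List[str]) -> List[str]:
    """Baseline: pick the top-scoring tag per token, ignoring transitions."""
    return [max(tags, key=lambda tag: step[tag]) for step in emissions]


if __name__ == "__main__":
    print(greedy_decode(EMISSIONS, TAGS))                 # ['O', 'I-PEST', 'I-PEST', 'O']
    print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))   # ['O', 'B-PEST', 'I-PEST', 'O']
```

In this toy case the per-token choice starts an entity with `I-PEST`, which is not a valid BIO sequence. Decoding over the whole sequence with transition scores repairs it to `B-PEST`, `I-PEST`. This is the general motivation for placing a CRF after the encoder layers.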