Research On Agricultural Named Entity Recognition Method Based On XLNet

Posted on: 2024-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: F Gu
Full Text: PDF
GTID: 2543307139456334
Subject: Computer technology
Abstract/Summary:
With the advancement of agricultural informatization and the rapid development of natural language processing technologies, more and more grassroots farmers and related practitioners ask questions and acquire knowledge through the Internet. Named entity recognition in agricultural texts is one of the foundations for other tasks: it identifies entities and extracts relevant information from various kinds of unstructured question-and-answer data. Agricultural named entity recognition can help practitioners quickly find the information they need in massive agricultural literature, such as specific crop diseases and suitable pesticides and fertilizers, so that problems in agricultural production are solved more efficiently. It can also assist research in related fields, for example by supporting the construction of knowledge graphs in the agricultural field and of domain expert think tanks, further promoting development and innovation in agriculture. Applying named entity recognition in agriculture therefore has important practical significance.

At present, named entity recognition in the agricultural field is still in its infancy. Although some research has addressed the task, compared with other fields it still faces challenges such as the lack of corpora, the diversity and complexity of domain terms, and the ambiguity of identical entity names. To address these problems, this thesis studies named entity recognition in the agricultural field. The specific research contents are as follows:

(1) At present, the agricultural field lacks mature named entity recognition datasets. In response, this thesis uses journal articles and web texts as data sources to collect and organize a text corpus for the agricultural field and sorts out the common entity categories in the field. After preprocessing and cleaning the raw text, the corpus was labeled semi-automatically with annotation tools and then manually reviewed and verified. The result is an annotated agricultural corpus containing 20,835 entities.

(2) The commonly used BERT and Bi-LSTM models are replaced with a model built on the pre-trained XLNet. XLNet is a pre-training technique based on a permutation language model. Unlike traditional pre-trained models, it absorbs a large amount of token-order information, has stronger encoding ability, captures the semantics of the text more fully, and alleviates the problem of polysemy. XLNet also uses Transformer-XL to strengthen the capture of long-distance dependencies, so associations between entities in long texts are obtained more reliably. The resulting word vector representations are fed into an iterated dilated convolutional network (IDCNN) for context encoding, which makes full use of GPU capabilities and improves both accuracy and efficiency. Finally, a conditional random field (CRF) identifies the label information and outputs the optimal label sequence.

(3) The constructed XLNet-IDCNN-CRF model is evaluated on the constructed agricultural corpus and compared with a number of other mainstream models in terms of performance and efficiency. In terms of performance, the model's accuracy, recall, and F1 score are better than those of the other models. In terms of efficiency, with little difference in convergence speed, its time per iteration is the shortest among the compared models, so it also shows a certain advantage in efficiency. These results confirm the effectiveness of the model. The characteristics of the model itself are also analyzed to facilitate future work.

This research has effectively improved the performance of named entity recognition in the agricultural field, but there is still room for improvement. Future research can further improve the accuracy and efficiency of named entity recognition to give agricultural practitioners more accurate and more comprehensive information support and help them better solve problems in production. Integration with other technologies and fields related to agriculture is also needed to better explore the value and application prospects of agricultural data.
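
The final step described in the abstract, where a conditional random field picks the optimal label sequence from the encoder's per-token scores, is commonly implemented with Viterbi decoding. The following is a minimal sketch of that idea, not the thesis's implementation. The label set (O, B-DIS, I-DIS), the example tokens, and all emission and transition scores are hypothetical values chosen for illustration. In the real model the emission scores would come from the XLNet-IDCNN encoder and the transition scores would be learned.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions[t][y] is the encoder score for label y at token t.
    transitions[(a, b)] is the score for moving from label a to label b;
    pairs that are not listed default to 0.0.
    """
    if not emissions:
        return []
    # best[y] = score of the best path that ends in label y at the current token
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the previous label that maximizes path score + transition score
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, y), 0.0))
            new_best[y] = best[prev] + transitions.get((prev, y), 0.0) + scores[y]
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label to recover the full path
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical example: "wheat stripe rust spreads", with a disease entity (DIS).
LABELS = ["O", "B-DIS", "I-DIS"]
EMISSIONS = [
    {"O": 0.2, "B-DIS": 1.0, "I-DIS": 0.1},   # wheat
    {"O": 0.5, "B-DIS": 0.6, "I-DIS": 0.55},  # stripe
    {"O": 0.3, "B-DIS": 0.2, "I-DIS": 1.0},   # rust
    {"O": 1.5, "B-DIS": 0.1, "I-DIS": 0.2},   # spreads
]
TRANSITIONS = {
    ("O", "I-DIS"): -10.0,     # an entity cannot continue right after O
    ("B-DIS", "I-DIS"): 0.5,   # continuing a started entity is encouraged
    ("I-DIS", "I-DIS"): 0.5,
}

greedy = [max(LABELS, key=lambda y: scores[y]) for scores in EMISSIONS]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, LABELS)
print("greedy :", greedy)
print("viterbi:", decoded)
```

With these illustrative scores, choosing the best label for each token independently tags "stripe" as the start of a second entity, while the transition scores let Viterbi decoding keep "wheat stripe rust" together as a single entity. This is the reason a CRF layer is placed on top of the encoder rather than a per-token classifier.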
Keywords/Search Tags: Agricultural named entity recognition, XLNet, iteratively dilated convolutional network, conditional random field, deep learning