Research On Chinese Named Entity Recognition Based On RoBERTa-WWM

Posted on: 2024-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Li
GTID: 2568306932999809
Subject: Information and Communication Engineering
Abstract/Summary:
Natural language processing (NLP) is an important part of artificial intelligence and plays a significant role in everyday life. With the arrival of the AI era and the steady emergence of NLP-related technologies, people’s lifestyles have changed considerably: NLP improves work efficiency to varying degrees and brings new convenience to daily life. Within NLP, named entity recognition (NER) is an important task with an increasingly wide range of applications. This thesis uses deep learning and reinforcement learning to study Chinese named entity recognition. The specific research contents are as follows.

First, the masking scheme of the BERT model can mask only individual characters, not whole words. To address this, the RoBERTa-WWM model is used in the pre-training stage. Because RoBERTa-WWM uses whole-word masking, the word vectors it produces incorporate context information well, which improves entity recognition. Compared with BERT, RoBERTa-WWM is therefore better suited to Chinese named entity recognition. On this basis, the thesis constructs the RoBERTa-WWM-BiGRU-CRF model, whose F1 values reach 96.78% and 96.84% on the People’s Daily and MSRA corpora, respectively. The experimental results show that the RoBERTa-WWM-BiGRU-CRF model can handle polysemy, which improves the accuracy of Chinese named entity recognition.

Second, the RoBERTa-WWM-BiGRU-CRF model tends to hit a bottleneck when pushed for better performance. To address this, the thesis proposes the RoBERTa-WWM-BiGRU-CRF-DRL model, which builds on RoBERTa-WWM-BiGRU-CRF by introducing reinforcement learning; its main advantage is the ability to correct the labels predicted by the model. The experimental results show that, compared with the RoBERTa-WWM-BiGRU-CRF model, the F1 value of the RoBERTa-WWM-BiGRU-CRF-DRL model increases by 1.03% on the People’s Daily corpus and by 0.62% on the Microsoft Research Asia (MSRA) corpus. The experimental results also show that the threshold affects model performance: to avoid changing a correct label into a wrong one, an appropriate threshold should be set during the experiments.

Third, because a pure command-line interface does not allow the model to be called directly and conveniently for Chinese named entity recognition, the thesis designs and implements Chinese named entity recognition software and describes it in four parts: requirements analysis, design, implementation, and testing. It first analyzes and lists the specific requirements of Chinese named entity recognition, then describes the factors considered during design and the advantages of the software, introduces the development environment, overall structure, and architecture design, shows the functions the software provides and how to operate them, and finally tests the software’s output. The test results are basically consistent with the expectations set at the start of the design, and the software shows relatively high accuracy and stability in Chinese named entity recognition.
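The difference between character-level masking and whole-word masking described in the abstract can be illustrated with a minimal sketch. This is not the thesis's code or the RoBERTa-WWM implementation: the function names, the example sentence, and the masking probability are assumptions made for illustration, the sentence is assumed to be already segmented into words, and real pre-training adds details (such as replacing some selected tokens with random or unchanged tokens) that are left out here.

```python
import random

MASK = "[MASK]"


def char_level_mask(words, mask_prob, rng):
    """Mask each character independently, ignoring word boundaries."""
    out = []
    for word in words:
        for ch in word:
            out.append(MASK if rng.random() < mask_prob else ch)
    return out


def whole_word_mask(words, mask_prob, rng):
    """Decide once per word; if chosen, mask every character of that word."""
    out = []
    for word in words:
        if rng.random() < mask_prob:
            out.extend([MASK] * len(word))
        else:
            out.extend(list(word))
    return out


# Hypothetical pre-segmented sentence: "Peking University / is located in / Beijing"
words = ["北京大学", "位于", "北京"]
rng = random.Random(0)  # fixed seed so repeated runs give the same masks

print(char_level_mask(words, 0.3, rng))
print(whole_word_mask(words, 0.3, rng))
```

With character-level masking, a word such as 北京大学 can end up partially masked, so the model may recover the missing character from the rest of the same word. With whole-word masking, the word is hidden or visible as a unit, so a prediction has to rely on the surrounding context.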
Keywords/Search Tags: Chinese named entity recognition, RoBERTa-WWM, deep learning, reinforcement learning