| With the rapid development of smart technology and the Internet, efficiently extracting valuable information from text and making use of such text data has become a key research topic. Named Entity Recognition (NER) is mainly used to process text data and identify entities with strong denotations in unstructured text. However, Chinese entity recognition is affected by ambiguous entity boundaries and by polysemous and polyphonic characters; the large number of non-entity tokens introduces noise; and the feature information extracted by a single neural network model is locally unstable. All of these factors degrade Chinese entity recognition. Therefore, this paper proposes two Chinese fine-grained entity recognition models, and the work centers on the following two parts: (1) Blurred entity boundaries, polysemous words and polyphonic words prevent word information from being represented well, and non-entities introduce noise. To address these problems, this paper proposes a fine-grained entity recognition model based on Chinese pre-training and virtual adversarial training. First, ChineseBERT, a Chinese pre-trained model, generates a rich semantic vector that fuses the pinyin vector, glyph vector, word vector and position vector. Adversarial samples are then generated by adding adversarial perturbations of the same size to this semantic vector representation, and the losses of the adversarial samples and the original samples are computed separately and summed. The representations are fed into a multilayer perceptron (MLP) and a bidirectional long short-term memory network (BiLSTM), and the resulting global vectors are input to a conditional random field (CRF) for decoding, which yields globally optimized predicted label sequences and achieves Chinese fine-grained named entity recognition with high accuracy. (2) To fully extract the feature
information of the text, this paper proposes a fine-grained entity recognition model based on iterated dilated convolution and fused neural networks. The iterated dilated convolutional neural network (IDCNN) is incorporated into the Chinese fine-grained named entity recognition model based on Chinese pre-training and virtual adversarial training. IDCNN extracts local features of the text, while the multilayer perceptron (MLP) and the bidirectional long short-term memory network (BiLSTM) together extract global features. The fused global and local features are input to the conditional random field (CRF) for decoding and labeling, which effectively captures the dependencies between labels and yields a complete label sequence. Comparison experiments and ablation experiments were conducted on the CLUENER 2020, Weibo NER and Resume datasets. The results show that both proposed models effectively improve named entity recognition performance, which verifies their effectiveness. |
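
Both models end with CRF decoding, which selects the label sequence with the highest total score instead of choosing each label independently. The following is a minimal sketch of that decoding step (Viterbi search) in plain Python. It is not the authors' implementation: the function name `viterbi_decode`, the three-label scheme (0 = O, 1 = B, 2 = I) and all scores are hypothetical values chosen for illustration, and in the described models the emission scores would come from the MLP/BiLSTM (and IDCNN) layers.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence.

    emissions[t][j]   -- score of label j at position t (hypothetical encoder output)
    transitions[i][j] -- score of moving from label i to label j
    """
    if not emissions:
        return []
    num_labels = len(emissions[0])
    # score[j]: best score of any path that ends in label j at the current position
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_labels):
            # best previous label for arriving at label j
            best_prev = max(range(num_labels),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # follow the back-pointers from the best final label
    path = [max(range(num_labels), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Illustrative scores only: labels are 0 = O, 1 = B, 2 = I.
emissions = [[2.0, 1.0, 0.0],
             [0.0, 0.0, 3.0],
             [1.0, 0.0, 0.0]]
transitions = [[0.0, 0.0, -10.0],  # O -> I is heavily penalised
               [0.0, 0.0, 1.0],    # B -> I is encouraged
               [0.0, 0.0, 0.5]]
print(viterbi_decode(emissions, transitions))  # [1, 2, 0], i.e. B, I, O
```

Picking the top-scoring label at each position independently would give O, I, O here, which is an invalid sequence under a BIO scheme. The transition scores steer the decoder to B, I, O, which illustrates how a CRF layer captures dependencies between labels.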