The development of natural language processing is profoundly changing human life. Named entity recognition (NER) is one of the core tasks of natural language processing, and its accuracy plays an important role in downstream tasks such as machine translation, recommendation systems, and information retrieval. Chinese named entity recognition occupies an important place in NER research due to the specificity of the task and the popularity of the language. In addition, improving model recognition accuracy usually requires a large amount of labeled data, and the shortage of high-quality Chinese labeled datasets has become a major factor limiting algorithm performance. Therefore, research on efficient Chinese named entity recognition algorithms that work with only a small amount of labeled data is of great significance and value.

This paper takes Chinese named entity recognition with a small amount of labeled data as its research object. Through additional supervision and other methods, it aims to minimize the manual labeling cost while still achieving a given level of recognition performance. The research work includes:

1) In view of the lack of labeled training data for the Chinese named entity recognition task, this paper proposes an automatic labeling method for Chinese entity triggers and a named entity recognition model, m-TMN, designed for small training sets. The model exploits the additional supervision in the training data by jointly training sentence vectors and trigger vectors through a trigger matching network, and then uses the trigger vector as the attention query for the subsequent sequence labeling model. Experiments show that, using only 20% of the training data, the m-TMN model exceeds the performance of the traditional BiLSTM-CRF model trained on 40% of the data, and it also outperforms the TMN model in both accuracy and convergence speed.

2) Aiming at the problem that the classification accuracy and matching
accuracy of entity triggers in the m-TMN trigger matching network are low during training, this paper introduces a Dice loss factor into the model's joint training loss function and proposes the DM-TMN model. The experimental results show that, after the joint loss function is improved, the classification accuracy and matching accuracy of entity triggers both improve to varying degrees. Furthermore, the DM-TMN model also outperforms the m-TMN model on training datasets of the same scale.

3) In order to better extract the attention weights over the trigger encoding and the short-sentence-level encoding, and thus further improve model performance, this paper proposes the GLDM-TMN model, which integrates a Global-Local attention mechanism. In the attention-query training stage, the model uses the Local attention mechanism and the Global attention mechanism to weight the information around the trigger and the short-sentence-level text, respectively. The experimental results show that, compared with the DM-TMN model, the GLDM-TMN model achieves a further improvement on training datasets of the same proportion.
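The trigger-as-attention-query idea in contribution 1 can be illustrated with plain dot-product attention: the jointly trained trigger vector queries the sentence's token encodings, and the resulting weights focus the sequence labeler on trigger-relevant positions. This is a minimal sketch under assumed shapes, not the thesis's exact architecture; the function and variable names are hypothetical.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def trigger_attention(token_encodings, trigger_vector):
    """Use the trigger vector as an attention query over token encodings.

    token_encodings: (seq_len, dim) matrix of per-token vectors
    trigger_vector:  (dim,) jointly trained trigger representation
    Returns the attention distribution and the attended sentence summary.
    """
    scores = token_encodings @ trigger_vector   # (seq_len,) similarity scores
    weights = softmax(scores)                   # attention distribution
    context = weights @ token_encodings         # trigger-focused summary vector
    return weights, context

# toy example: 4 tokens, dimension 3
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
q = np.array([1.0, 1.0, 0.0])   # hypothetical trigger vector
w, c = trigger_attention(H, q)
```

The token most similar to the trigger query (the last row here) receives the largest attention weight, which is the behavior the trigger matching network is meant to exploit.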
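The Dice loss factor in contribution 2 can be illustrated with the common smoothed soft-Dice formulation; the thesis's exact variant is not specified here, so this is only a sketch of the standard form, which down-weights easy negatives relative to cross-entropy and thus helps with the class imbalance typical of trigger classification.

```python
import numpy as np

def dice_loss(probs, targets, eps=1.0):
    """Smoothed soft-Dice loss, averaged over positions.

    probs:   predicted probabilities in [0, 1]
    targets: gold binary labels (0 or 1)
    eps:     smoothing term that keeps the ratio defined when both
             prediction and target are zero
    """
    probs = np.asarray(probs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    num = 2.0 * probs * targets + eps
    den = probs ** 2 + targets ** 2 + eps
    return float(np.mean(1.0 - num / den))

# perfect predictions drive the loss to zero; poor ones raise it
loss_good = dice_loss([1.0, 0.0, 1.0], [1, 0, 1])
loss_bad = dice_loss([0.1, 0.9, 0.2], [1, 0, 1])
```

In the DM-TMN setting this term would be combined with the existing joint training loss, e.g. as a weighted sum, though the weighting scheme is not given in the abstract.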