Chinese Military Named Entity Recognition Using Multi-neural Network Collaboration

Posted on: 2021-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Yin
Full Text: PDF
GTID: 2416330620468122
Subject: Software engineering
Abstract/Summary:
Social media plays an important role in daily life, and open-source military information obtained from large-scale social text has become an important source for military research, military trend prediction, and other military tasks. Military named entity recognition identifies weapons, military facilities, and other military-domain entity types in text; it is a basic and key task for military intelligence generation, military knowledge graph construction, and related research. Military entity recognition in social text faces many challenges: the lack of an open corpus and of an entity classification standard, unclear entity boundaries, nonstandard expression of military terms in social media, insufficient distributed representation of words, a single interest model, and weak model generalization. To address these challenges, a Chinese named entity recognition method based on multi-neural network collaboration is proposed. The main contributions are as follows:

(1) Entity tagging rules that account for fuzzy entity boundaries, together with entity classification standards for the military field, are designed. Corpus tagging and corpus quality enhancement methods based on an arbitration mode are proposed, which improve the accuracy of corpus tagging. A military corpus comprising 15,317 microblogs and 20,388 sentences, covering 8 entity categories, is constructed to address the lack of public corpora in the military field and to lay the foundation for entity recognition.

(2) A multi-neural network collaborative entity recognition model for the military field is proposed. The word vector representation module is based on BERT (Bidirectional Encoder Representations from Transformers) and combines the word features, sentence features, and position features of the corpus to generate word vectors, addressing the insufficient distributed representation of words. The context feature extraction module is based on BiLSTM (Bi-directional Long Short-Term Memory) and extracts further context features from the word vectors. The globally optimal tag sequence is obtained by a CRF (Conditional Random Field)-based module. Experimental results show that the F-score and recall of the model are 18.65% and 28.48% higher than those of the CRF-based model, 8.69% and 13.91% higher than those of the BiLSTM-CRF-based model, and 5.15% and 7.08% higher than those of the CNN (Convolutional Neural Network)-BiLSTM-CRF-based model.

(3) An Active Learning Military Named Entity Recognition (ALMNER) method is proposed to improve the effectiveness and generalization ability of the military-domain entity recognition model. A sample selection algorithm based on sample confidence and sample balance is proposed to remove samples that contain no military-domain entities and to balance the entity categories in the sample set. Experiments compare the ALMNER method with the supervised learning entity recognition method and with a method based on random sampling. The results show that the F-score of ALMNER is 0.48% higher than that of the supervised learning method and 3.41% higher than that of the random-sampling method. For military event entities, the F-score of ALMNER is 7.56% higher than that of the supervised learning method and 7.61% higher than that of the random-sampling method. For military facility entities, it is 3.13% higher than that of the supervised learning method and 8% higher than that of the random-sampling method. For military rank entities, it is 3% higher than that of the supervised learning method and 4.91% higher than that of the random-sampling method.
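As an illustration of how a CRF layer such as the one in contribution (2) can yield a globally optimal tag sequence, the following is a minimal sketch of Viterbi decoding over a linear-chain score. It is not the thesis's implementation: the tag names (`O`, `B-WEA`, `I-WEA`), the emission scores (standing in for BiLSTM outputs), and the transition scores are all made-up assumptions chosen only to show that per-token greedy choices can produce an invalid sequence that sequence-level decoding avoids.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]],
                   tags: Sequence[str]) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain score.

    emissions[t][j]  : score of tag j at position t (e.g. from a BiLSTM).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    n_tags = len(tags)
    # best[j] = best score of any path that ends in tag j at the current position
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(n_tags):
            # Previous tag that maximizes the score of a path entering tag j
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path
    path = [max(range(n_tags), key=lambda j: best[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[j] for j in path]


# Hypothetical tag set: "WEA" stands for a weapon entity (illustrative only).
TAGS = ["O", "B-WEA", "I-WEA"]
# Assumed transition scores: "I-WEA" directly after "O" is heavily penalized.
TRANSITIONS = [
    [0.0, 0.0, -1e4],  # from O
    [0.0, 0.0, 0.0],   # from B-WEA
    [0.0, 0.0, 0.0],   # from I-WEA
]
# Assumed emission scores for three tokens; greedy argmax gives O, I-WEA, O.
EMISSIONS = [
    [1.0, 0.8, 0.0],
    [0.0, 0.2, 1.0],
    [1.0, 0.0, 0.3],
]

greedy = [TAGS[max(range(len(TAGS)), key=lambda j: row[j])] for row in EMISSIONS]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
print(greedy)   # per-token choice, ignores transitions
print(decoded)  # sequence-level choice, respects transitions
```

In this toy setting the greedy labels begin an entity with an inside tag, while the decoded sequence trades a slightly lower score on the first token for a valid entity span. In a trained CRF the transition scores would be learned from the tagged corpus rather than set by hand.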
Keywords/Search Tags: Named entity recognition, Military, Social text, Chinese character embedding representation, Multi-neural network, Active learning