With the rapid growth of Internet data volume, efficiently and accurately extracting valuable information from massive unstructured text has become an urgent need. Named Entity Recognition (NER) is a fundamental task in Natural Language Processing. Its main purpose is to extract entities with practical meaning from text, such as names of people, places, and organizations, and it plays a key role in many applications such as relation extraction, information retrieval, and knowledge graph construction. Since Chinese text has no natural word boundaries, word-based approaches inevitably introduce segmentation errors, while character-based approaches avoid these errors but discard a great deal of word-level information.

To address these problems, this paper conducts experiments on four widely used public datasets in the field and proposes a Multi-Semantic Features Model (MSFM), which works as follows:

(1) To address the problem that the input representation layer contains too few features, we use various auxiliary resources as additional features to enrich the input representation (each is sketched after this summary):
(a) The character-based method cannot fully exploit word information, so word information is introduced through the Soft Lexicon method; and because out-of-vocabulary words easily appear on small datasets, we propose to count word frequencies from a pre-trained lexicon and use them as the weights for fusing word information.
(b) A bichar is the combination of the current character and the next character; its vector representation is obtained with Word2vec and concatenated with the character vector, enriching the character-based model with bichar information.
(c) To address the polysemy problem of Word2vec, we use the RoBERTa pre-trained model, whose bidirectional Transformer extracts contextual semantic features, to obtain a dynamic context-dependent vector for each character, which can effectively improve the performance of downstream tasks.

(2) In NER tasks, LSTM tends to achieve better results than other context encoders; however, it is prone to overfitting when the dimensionality of the input representation is too high. In this paper, the sequence modeling layer is implemented as a single-layer bidirectional gated recurrent unit (Bi-GRU), which has fewer parameters and fits faster than LSTM, and better captures the contextual dependencies in high-dimensional vector representations.

(3) Incorporating multiple features in the input representation layer can effectively improve model performance. To prevent overfitting, we introduce Dropout layers for regularization and, more importantly, conduct extensive experiments on the number of Dropout layers and their positions in the model to find the settings that further optimize performance. The relative contribution of each component is also evaluated through a detailed ablation study.

The experimental results show that the proposed model outperforms the comparison models in both performance and inference speed, has a general structure, and transfers well to other settings. In addition, its low training cost makes it suitable for both research and industrial applications.
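To make the frequency-weighted Soft Lexicon feature of (1a) concrete, the sketch below builds the B/M/E/S word sets for each character and fuses each set by a frequency-weighted average of the matched word embeddings. The lexicon, frequencies, and embedding size are toy assumptions for illustration, not the paper's actual resources.

```python
# Minimal sketch of Soft Lexicon features with frequency-based fusion weights.
import numpy as np

EMB_DIM = 4  # toy embedding size

# Hypothetical pre-trained lexicon: word -> corpus frequency
lexicon = {"南京": 120, "南京市": 80, "市长": 95,
           "长江": 150, "长江大桥": 60, "大桥": 70}
rng = np.random.default_rng(0)
word_emb = {w: rng.normal(size=EMB_DIM) for w in lexicon}  # stand-in vectors

def soft_lexicon_features(sentence, max_word_len=4):
    """For each character, collect the lexicon words in which it appears at the
    Begin/Middle/End position or as a Single-char word, then fuse each of the
    four sets by frequency-weighted averaging."""
    n = len(sentence)
    sets = [{"B": [], "M": [], "E": [], "S": []} for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, min(i + max_word_len, n) + 1):
            w = sentence[i:j]
            if w not in lexicon:
                continue
            if len(w) == 1:
                sets[i]["S"].append(w)
            else:
                sets[i]["B"].append(w)
                sets[j - 1]["E"].append(w)
                for k in range(i + 1, j - 1):
                    sets[k]["M"].append(w)
    feats = []
    for char_sets in sets:
        parts = []
        for tag in ("B", "M", "E", "S"):
            words = char_sets[tag]
            if not words:
                parts.append(np.zeros(EMB_DIM))
                continue
            freqs = np.array([lexicon[w] for w in words], dtype=float)
            weights = freqs / freqs.sum()  # word frequency as the fusion weight
            parts.append(sum(wt * word_emb[w] for wt, w in zip(weights, words)))
        feats.append(np.concatenate(parts))  # 4 * EMB_DIM features per character
    return np.stack(feats)

print(soft_lexicon_features("南京市长江大桥").shape)  # (7, 16)
```

The key point is that frequencies come from the pre-trained lexicon rather than the (small) training set, so rare or unseen words still receive sensible fusion weights.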
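The bichar feature of (1b) amounts to treating each character-plus-next-character bigram as a token and training Word2vec over these tokens. A minimal sketch using gensim's Word2Vec (4.x API) follows; the corpus, padding marker, and hyperparameters are toy assumptions.

```python
# Sketch of building bichar embeddings with Word2vec.
from gensim.models import Word2Vec

corpus = ["南京市长江大桥", "我爱北京天安门"]

def to_bichars(sent):
    # Pair each character with the next; pad the last position with an
    # end marker so every character position has a bichar token.
    return [sent[i] + (sent[i + 1] if i + 1 < len(sent) else "</s>")
            for i in range(len(sent))]

bichar_corpus = [to_bichars(s) for s in corpus]
model = Word2Vec(bichar_corpus, vector_size=50, window=2, min_count=1, sg=1)
print(model.wv["南京"].shape)  # (50,) -- the bichar "南京" as a single token
```

At the input layer, the bichar vector for position i is simply concatenated with the character vector for position i.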
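For the dynamic character vectors of (1c), the following sketch uses the Hugging Face transformers library to take one contextual vector per character from the encoder's last hidden layer. The checkpoint name is a commonly used public Chinese RoBERTa, assumed here since the summary above does not name the exact checkpoint.

```python
# Sketch of extracting per-character dynamic vectors from a Chinese RoBERTa model.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("hfl/chinese-roberta-wwm-ext")
model = AutoModel.from_pretrained("hfl/chinese-roberta-wwm-ext")

sent = "南京市长江大桥"
enc = tokenizer(sent, return_tensors="pt")
with torch.no_grad():
    out = model(**enc).last_hidden_state  # (1, len+2, 768) incl. [CLS]/[SEP]
char_vectors = out[0, 1:-1]  # one 768-d context-dependent vector per character
print(char_vectors.shape)    # torch.Size([7, 768])
```

Unlike a static Word2vec table, the same character receives different vectors in different sentences, which is what resolves the polysemy problem.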
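Points (2) and (3) together suggest the following shape for the sequence modeling layer: the concatenated multi-semantic features pass through Dropout, a single-layer Bi-GRU, and a second Dropout before tag classification. This PyTorch sketch is illustrative only; the dimensions, dropout rate, and tag count are assumptions, and the paper determines the number and placement of Dropout layers experimentally.

```python
# Sketch of the Bi-GRU sequence modeling layer over concatenated features.
import torch
import torch.nn as nn

class BiGRUEncoder(nn.Module):
    def __init__(self, char_dim=100, bichar_dim=50, roberta_dim=768,
                 lexicon_dim=200, hidden=128, num_tags=9, p=0.5):
        super().__init__()
        in_dim = char_dim + bichar_dim + roberta_dim + lexicon_dim
        self.drop_in = nn.Dropout(p)   # regularize the high-dimensional input
        self.gru = nn.GRU(in_dim, hidden, num_layers=1,
                          batch_first=True, bidirectional=True)
        self.drop_out = nn.Dropout(p)  # regularize before the tag classifier
        self.classifier = nn.Linear(2 * hidden, num_tags)

    def forward(self, char_v, bichar_v, roberta_v, lexicon_v):
        x = torch.cat([char_v, bichar_v, roberta_v, lexicon_v], dim=-1)
        h, _ = self.gru(self.drop_in(x))
        return self.classifier(self.drop_out(h))

# Toy usage: a batch of 2 sentences, 7 characters each
B, T = 2, 7
model = BiGRUEncoder()
logits = model(torch.randn(B, T, 100), torch.randn(B, T, 50),
               torch.randn(B, T, 768), torch.randn(B, T, 200))
print(logits.shape)  # torch.Size([2, 7, 9])
```

A single-layer GRU has roughly three gate matrices to LSTM's four, which is the parameter saving the summary refers to; with the high-dimensional concatenated input, the two Dropout positions shown here are one plausible configuration among those the paper searches over.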