| In the era of big data, biomedical research is evolving rapidly, and the volume of published literature grows every year. As a huge unstructured database, the biomedical literature holds a wealth of research knowledge and is the most important resource in the field of biomedicine. Increasing attention has therefore been paid to how professional knowledge can be acquired quickly from this massive body of literature. Biomedical text mining plays an important role in the automatic acquisition of knowledge from text. Named entity recognition, one of its tasks, aims to identify names of specific types in the biomedical literature, such as proteins, DNA, RNA, and cells, and is a prerequisite for the further extraction of relationships and other latent information. The research in this paper consists of the following three parts: (1) Biomedical named entity recognition based on conditional random fields (CRF). Using a biomedical corpus, 15 kinds of features are designed according to the characteristics of biological entities. A model is trained with the conditional random field algorithm, the best feature set is selected with the single optimal combination method, and the influence of each feature on the experimental results is analyzed. After testing and evaluation, the F-measure reaches 75.91%. (2) Biomedical named entity recognition based on a bidirectional long short-term memory (BiLSTM) network combined with a conditional random field. Traditional machine learning algorithms require features to be selected manually, which in turn requires domain knowledge. In addition, the quality of the model depends on a high-quality data set and an optimal feature set, which demands considerable manual effort. To address these problems, this part proposes a named entity recognition method based on a BiLSTM network combined with a conditional random field. After training, testing, and evaluation, the F-measure reaches 76.81%. The experimental results show that this method needs no manual feature extraction and also predicts better than the unidirectional LSTM network, the bidirectional LSTM network, and the traditional machine learning algorithm. (3) Design and implementation of a biomedical named entity recognition system. The model trained with the BiLSTM-CRF algorithm is applied to literature retrieved with "autism" as the keyword, and the recognized entities are displayed visually, which demonstrates the effectiveness and practicability of the algorithm. The named entity recognition method presented in this paper achieves good recognition performance and can quickly and automatically identify entity names in the massive biomedical literature, thus laying a foundation for entity relationship extraction. |
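
The F values reported above combine precision and recall. As a rough illustration of how such a score is typically computed for named entity recognition, the following minimal sketch assumes BIO-style tags and exact-match scoring at the entity level. The abstract states neither the tagging scheme nor the matching criterion, so both are assumptions, and the function names `bio_to_spans` and `precision_recall_f1` are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        # An I- tag of the same type continues the current span.
        if tag.startswith("I-") and etype == tag[2:]:
            continue
        # Any other tag closes the current span, if there is one.
        if etype is not None:
            spans.add((start, i, etype))
            start, etype = None, None
        # B- opens a new span; a stray I- is leniently treated as an opener.
        if tag.startswith("B-") or tag.startswith("I-"):
            start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level scores; a prediction counts only on exact span and type match."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (invented tags, not data from the paper).
gold = ["B-protein", "I-protein", "O", "B-DNA", "O"]
pred = ["B-protein", "I-protein", "O", "B-RNA", "O"]
print(precision_recall_f1(gold, pred))  # (0.5, 0.5, 0.5)
```

In the toy example the protein span matches exactly, while the second span has the right boundaries but the wrong type, so one of two predictions is correct and one of two gold entities is found.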