| With the continued development of medical informatization in China and the rapid advance of information technology, large volumes of Chinese medical text are constantly emerging, including clinical electronic medical records and professional medical literature. This textual data has become an essential resource for advancing medicine in China. As a key technology for data mining in the medical field, Chinese medical named entity recognition plays a decisive role in downstream tasks such as relation extraction between medical entities and medical text classification. Because of the complexity of Chinese expression and the specialized nature of medicine, Chinese medical named entities tend to have long spans and highly specialized content, and most of them have nested structures. Named entity recognition methods designed for general domains therefore cannot be applied directly to Chinese medical entities, and existing related research focuses only on medical entities in English text and on flat structures. To address these problems, improve the accuracy of Chinese medical named entity recognition, and further promote the development of medicine in China, the main research contents of this paper are as follows: 1. Current nested named entity recognition methods based on pointer networks obtain candidate entities by matching characters whose positional labels are "B" and "E", which discards the constraints that earlier named entity recognition methods placed on character positional labels; in addition, different characters within the span of a Chinese medical named entity differ in importance. To address these problems, and drawing on research into entity recognition methods for professional domains and on the characteristics of Chinese medical entities, a cascaded recognition method for Chinese medical named entities is proposed. First, the positional label of each character is detected by sequence labeling; then the positional information of the characters is used to guide the generation of candidate entities, and entity semantic classification is performed. The positional label of each character relative to the entity is embedded into the model, and the importance of the different elements within the span is taken into account when fusing them into an entity representation. 2. Conceptual representation models from the general domain cannot fully represent the professional semantics of Chinese medical texts; traditional entity recognition methods use two separate modules to identify the beginning and the end of an entity and pay no attention to the overall characteristics of the entity span; and current mainstream nested entity recognition methods usually recognize the entities of every layer without distinction, ignoring the effect that overlap between inner and outer entities has on recognition performance. To address these problems, and using a conceptual representation model for the Chinese medical domain, a branch-structured entity recognition method is proposed that divides Chinese medical named entity recognition into textual conceptual representation, candidate entity span extraction, and entity semantic classification. The normalization design used in reading comprehension technology is transferred so that the beginning and end of an entity are detected as a whole, and the branch structure is used to identify each layer of Chinese medical entities in the nested structure in parallel. |
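
As a rough illustration of the candidate-generation step in the cascaded method, the sketch below pairs each character labeled "B" with later characters labeled "E" to form candidate spans, which may nest. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `candidate_spans`, the B/I/E/O/S tag set, the `max_len` cap, and the rule that a span may not cross an "O" character are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def candidate_spans(labels: List[str], max_len: int = 10) -> List[Tuple[int, int]]:
    """Generate candidate entity spans (inclusive indices) from positional labels.

    Assumed tag set (hypothetical): B = begin, I = inside, E = end,
    O = outside, S = single-character entity.
    """
    spans: List[Tuple[int, int]] = []
    for start, tag in enumerate(labels):
        if tag == "S":
            # A single-character entity is its own candidate span.
            spans.append((start, start))
        elif tag == "B":
            # Scan forward for every reachable "E", allowing nested candidates.
            for end in range(start + 1, min(len(labels), start + max_len)):
                if labels[end] == "O":
                    # Assumed constraint: a span may not cross an outside character.
                    break
                if labels[end] == "E":
                    spans.append((start, end))
    return spans


# Hypothetical label sequence for a seven-character sentence.
labels = ["B", "E", "I", "E", "O", "B", "E"]
print(candidate_spans(labels))
```

In this sketch the candidates starting at index 0 overlap, which is how nested entities would surface before the semantic classification stage accepts or rejects each span.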