Research On Attribute Labeling Method For Named Entities Of Network Questioning Text

Posted on: 2024-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Sun
GTID: 2544306938456434
Subject: Information Science
Abstract/Summary:
With the spread of Internet technology, online consultation about health and disease has become part of everyday life, and many common, mild conditions can be handled effectively this way. Once the consultation text has been processed, important entities and relationships can be extracted to produce preliminary outpatient medical records together with a simple diagnosis and reasoning process. These consultations generate a large volume of doctor-patient dialogue text, and the patient questions and doctor answers it contains are valuable real-world data for applications such as AI-assisted diagnosis and automatic knowledge base updating.

This study explores how to annotate named entities automatically in real-world doctor-patient dialogue texts from the Internet, focusing on the classification of diagnosis-related entities. Different types of medical texts are first surveyed and the characteristics of the corresponding data sets summarized. Medical entities are then extracted from the dialogue, and the entities associated with the final diagnosis are further classified. The thesis proposes the RWT (RoBERTa-wwm-ext + TextCNN) model to classify and annotate named entity attributes automatically. A bidirectional language representation model produces word vectors that encode context and position, and a neural network then predicts the classification labels. Specifically, hidden layers of the language representation model are selected according to its characteristics and spliced together, yielding a new multi-feature fusion model. To illustrate the advantages of RWT, comparative experiments were run with several bidirectional language representation backbones: BERT, BERT-wwm, and the Chinese pre-trained model ERNIE.

Results: the total score of ERNIE, which is trained on a large Chinese corpus, was 12.60 points higher than that of BERT-base. BERT-wwm surpassed ERNIE with a score of 69.28, and the proposed RWT model improved the total score by 1.28 points. Whole-word masking is preferred: masking whole words as a unit instead of individual tokens greatly improves the accuracy of the original model when classifying and labeling medical entities. Better results can be obtained by optimizing the model's downstream tasks and, at a later stage, by integrating multiple models. Automatic annotation of named entity attributes in real doctor-patient dialogue text generated during consultation makes it possible to further refine medical information in the future and paves the way for the automatic generation of medical records.
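The splice-then-classify idea described in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the thesis's implementation: the function names (`splice_layers`, `conv_max_pool`, `classify`), the choice of layers, the filter widths, and all numbers are hypothetical toy values, and a real system would take the hidden states from a pre-trained RoBERTa-wwm-ext encoder inside a deep learning framework.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]  # one row per token


def splice_layers(layers: List[Matrix], chosen: List[int]) -> Matrix:
    """Concatenate, token by token, the hidden states of the chosen layers."""
    n_tokens = len(layers[0])
    return [
        [x for idx in chosen for x in layers[idx][t]]
        for t in range(n_tokens)
    ]


def conv_max_pool(tokens: Matrix, kernel: Matrix, bias: float = 0.0) -> float:
    """Apply one TextCNN filter: 1-D convolution, ReLU, max-over-time pooling.

    Returns 0.0 if the sequence is shorter than the filter width.
    """
    width = len(kernel)
    best = 0.0  # starting at 0.0 folds the ReLU into the running maximum
    for start in range(len(tokens) - width + 1):
        total = bias
        for k in range(width):
            total += sum(w * x for w, x in zip(kernel[k], tokens[start + k]))
        best = max(best, total)
    return best


def classify(features: Vector, weights: Matrix, biases: Vector) -> int:
    """Linear layer followed by argmax over the attribute labels."""
    scores = [
        b + sum(w * f for w, f in zip(row, features))
        for row, b in zip(weights, biases)
    ]
    return scores.index(max(scores))


# Toy stand-in for encoder output: 2 layers, 3 tokens, hidden size 2.
layers = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # hypothetical layer 0
    [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]],  # hypothetical layer 1
]

# Multi-feature fusion: each token vector now has 2 + 2 = 4 dimensions.
spliced = splice_layers(layers, chosen=[0, 1])

# Two hypothetical filters of different widths (2 tokens and 1 token).
filters = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, -1.0]],
]
features = [conv_max_pool(spliced, kernel) for kernel in filters]

# Hypothetical classifier over two attribute labels (0 and 1).
label = classify(features, weights=[[1.0, 0.0], [-1.0, 1.0]], biases=[0.0, 0.5])
```

Each filter contributes one pooled feature, so the classifier input has one dimension per filter regardless of dialogue length. Which encoder layers to splice and how many filters of each width to use are tuning decisions that the abstract does not specify.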
Keywords/Search Tags:Named entity, Network consultation, Attribute annotation, Multi-feature fusion mode