Study On Extraction Of Lesion Presentation For Understanding Textual Nuclear Medicine Diagnostic Reports

Posted on: 2022-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhang
GTID: 2504306485459374
Subject: Computer technology
Abstract/Summary:
With the rapid development of medical informatization, major hospitals generate large amounts of medical data while providing services to patients. SPECT diagnostic text is an important kind of medical text data in nuclear medicine. It records the physician's diagnosis of the patient's condition based on the SPECT instrument and contains rich medical knowledge. In practical applications, extracting information from this text helps to establish auxiliary diagnosis models and supports the prevention and treatment of bone tumor diseases in China. In academic research, medical text extraction helps to build a corpus of lesion entities and provides the conditions for automatic annotation. Mature methods exist for data extraction from medical text, but research on extracting lesion characterization from SPECT text is still in its infancy. To address the problem of lesion characterization extraction, the main work of this thesis is as follows:

(1) To improve lesion extraction, a data augmentation algorithm, CALBERT, based on a conditional mask model is proposed to address the sparsity of the SPECT diagnostic text dataset. First, the original data were cleaned according to the characteristics of SPECT diagnostic text, and the diagnostic description data and diagnostic result data were extracted. Second, the "BIO" annotation scheme was used to annotate the data and form the annotated corpus. Finally, a controlled experiment was set up in which the data produced by the conditional mask model were applied to the BERT and ALBERT models and compared with the unmodified models. The accuracy, recall, and F1 value of CBERT are 95.68%, 93.23%, and 94.41%, respectively. The experimental results show that the data augmentation algorithm based on the conditional mask model can effectively expand the dataset and prepare for effective data extraction.

(2) Extraction methods already have mature applications in general domains. However, because professional domains involve a large amount of medical background knowledge, traditional models cannot extract this characterization information efficiently. To address poor extraction performance in professional domains, this thesis proposes a domain-based pre-training algorithm, ReALBERT, and builds a Re-CALBERT-BiLSTM-CRF model framework on this algorithm. First, the augmented data are fed into the BiLSTM layer. Second, the unlabeled original SPECT text is loaded into the ALBERT pre-trained model, which is then trained again using the further training method. Subsequently, the annotated data are applied to the model. Finally, a control group was set up to verify the effectiveness of the method. The experimental results showed that the Re-CALBERT-BiLSTM-CRF method performed best, with accuracy, recall, and F1 value of 95.66%, 94.85%, and 95.23%, respectively. Based on the extraction results, key-value matching between the "location" attribute and the five attribute types "shape, degree, state, disease, suggestion" is realized through regularization.

(3) Based on the above processing methods, this thesis designs and implements a SPECT lesion characterization generation system. The system extracts lesion characterization information and stores it in a back-end database to build a nuclear medicine characterization database.
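The abstract does not spell out how BIO tags are turned into entities or how "location" is paired with the other attributes, so the following Python sketch is only an illustration under stated assumptions. The tag names (`B-location`, `I-shape`, and so on), the helper functions `decode_bio` and `group_by_location`, the English example tokens, and the "attach to the nearest preceding location" heuristic are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

# Attribute labels named in the abstract; their use as tag suffixes is an assumption.
ATTRIBUTES = {"shape", "degree", "state", "disease", "suggestion"}


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse a BIO-tagged token sequence into (label, text) entities."""
    entities: List[Tuple[str, str]] = []
    label, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity that is still open.
            if label is not None:
                entities.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open entity.
            words.append(token)
        else:
            # "O" or an inconsistent I- tag: close the open entity, skip the token.
            if label is not None:
                entities.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        entities.append((label, " ".join(words)))
    return entities


def group_by_location(entities: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Attach each attribute to the nearest preceding location (assumed heuristic)."""
    records: List[Dict[str, str]] = []
    for label, text in entities:
        if label == "location":
            records.append({"location": text})
        elif label in ATTRIBUTES and records:
            # Keep the first value if the same attribute repeats for one location.
            records[-1].setdefault(label, text)
    return records


# Made-up example sentence, not taken from the thesis corpus.
tokens = ["left", "rib", "shows", "focal", "increased", "uptake"]
tags = ["B-location", "I-location", "O", "B-shape", "B-degree", "O"]

entities = decode_bio(tokens, tags)
records = group_by_location(entities)
print(records)
```

Each resulting record is a small key-value structure keyed by location, which is roughly the shape of data a characterization database such as the one described in (3) might store. The actual matching rules used in the thesis may differ.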
Keywords/Search Tags: SPECT imaging, Diagnostic reports, Lesion presentation, Deep learning, ALBERT