Technology And Application Of Knowledge-Driven Multimodal Medical Diagnostic Report Generation

Posted on: 2024-03-23  Degree: Master  Type: Thesis
Country: China  Candidate: J W Zhang  Full Text: PDF
GTID: 2530306923957069  Subject: Software engineering
Abstract/Summary:
Medical diagnostic reports are an important basis on which doctors make diagnostic decisions and formulate treatment plans. As medical equipment becomes more intelligent and more refined, expectations for the efficiency and quality of these reports are also rising. Traditionally, a doctor reads the images manually, analyzes the condition, and writes the report. This process consumes considerable time and energy, greatly reduces the efficiency of medical treatment, and is prone to errors because the work is repetitive and mechanical. Report quality is also often limited by the doctor's professional knowledge, experience, and personal writing style. With the rapid development of artificial intelligence, AI-assisted medical diagnosis and decision-making has become a trend. Generating medical diagnostic reports with AI can improve the efficiency of medical treatment and provide modern diagnostic services to large populations, while also ensuring the quality of the generated reports and helping to control the quality of medical services.

However, AI-assisted medical diagnostic report generation still faces many challenges, mainly including: (1) The medical images on which reports are based are hard to differentiate. Medical images are domain-specific and abstract. Unlike natural scene images, the disease and abnormality features in medical images are highly abstract and difficult to mine. In addition, the imaging appearance of some diseases is very similar to the normal appearance, with differences so small that they are difficult to distinguish. (2) The accuracy and completeness of generated reports are poor. Reports generated by existing methods contain inaccuracies and omissions to varying degrees, such as missed diagnoses or misdiagnoses, and lack key information such as disease location. (3) It is difficult to match images and text in medical diagnostic reports. Existing methods fail to align disease features with lesion information, and the region the model actually attends to when generating a sentence that describes a disease does not match the true lesion location in the image. As a result, the generated report suffers from image-text mismatch: the text description is inconsistent with the lesion region.

In response to these challenges, this thesis proposes a knowledge-driven multimodal medical diagnostic report generation technology, which mainly includes: (1) To address the difficulty of mining and extracting medical image features, this thesis proposes a combined CNN-Transformer model for medical image feature extraction. The model combines a convolutional neural network with the multi-head self-attention module of the Transformer, so that it can capture local image details and also extract features from a global perspective. This enhances the model's ability to extract multi-scale, fine-grained features from medical images and supports accurate analysis of those images. (2) To address inaccurate and incomplete medical diagnostic reports, this thesis constructs a graph convolution model based on a thoracic anatomy knowledge graph and a disease knowledge graph. By embedding the knowledge graphs as prior knowledge into the graph convolution network, the model extracts image features that carry anatomical position information and disease information. Integrating this with the image feature extraction model yields a knowledge-based CNN-Transformer combination model for disease detection. (3) To address image-text mismatch in reports, this thesis designs a cross-collaboration attention module that aligns and fuses anatomical information and disease information. Building on the two preceding contributions, it proposes an image-text consistent report generation method based on multimodal learning with cross-domain knowledge collaboration. Disease tag embeddings and image features are used together as the encoded features to generate image-text consistent medical reports.

This thesis conducts medical diagnostic report generation experiments on the Indiana thoracic X-ray dataset (IU-CXR) and the large public thoracic X-ray imaging dataset (MIMIC-CXR). The experimental results show that the proposed models outperform the mainstream baseline models: they predict diseases accurately, and the generated reports are both complete and consistent with the style of reports written by doctors. This thesis also designs and implements an end-to-end thoracic imaging diagnostic report generation system, which validates the above disease detection and medical report generation methods and provides report data collection and visual analysis functions. The system has been applied in clinical practice at Qilu Hospital of Shandong University and has achieved good results.
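
As a rough illustration of contribution (2), the following sketch shows one propagation step of a standard graph convolution layer over a tiny knowledge graph. It is not the thesis's implementation: the abstract does not specify the graph convolution variant, so the common symmetric-normalization form ReLU(D^{-1/2}(A + I)D^{-1/2} X W) is assumed here. The three-node graph, the node names, the feature values, and the weight matrix are all hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weight):
    """One graph convolution step: ReLU(A_norm @ X @ W)."""
    a_norm = normalize_adjacency(adj)
    hidden = matmul(matmul(a_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical toy graph: node 0 = "lung", node 1 = "heart",
# node 2 = "cardiomegaly"; edges 0-1 and 1-2 (assumed for illustration).
ADJ = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]

# Hypothetical 2-dimensional node features (in the thesis these would come
# from the image feature extractor; here they are made-up numbers).
FEATURES = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 0.0]]

# Identity weight, so the output shows the effect of propagation alone.
WEIGHT = [[1.0, 0.0],
          [0.0, 1.0]]

if __name__ == "__main__":
    for name, row in zip(["lung", "heart", "cardiomegaly"],
                         gcn_layer(ADJ, FEATURES, WEIGHT)):
        print(name, [round(v, 3) for v in row])
```

After one step, each node's feature vector is a degree-weighted mix of its own features and those of its neighbours in the graph, which is the sense in which a knowledge graph can inject anatomical and disease priors into image features.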
Keywords/Search Tags:Medical diagnostic report generation, Medical image processing, Disease detection, Knowledge graph