Research On Method Of Retinal OCT Image Report Generation Based On Deep Learning

Posted on: 2023-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: C Guo
Full Text: PDF
GTID: 2544306629477664
Subject: Information and Communication Engineering
Abstract/Summary:
Optical Coherence Tomography (OCT) is a non-invasive, high-resolution 3D imaging technique that is widely used in the clinical examination of retinal diseases. Ophthalmologists write diagnostic reports for patients by reading their retinal OCT images, and these reports provide objective evidence for further diagnosis and treatment. However, reading retinal OCT images and writing reports takes ophthalmologists a great deal of time, which lowers their efficiency, and their work may be affected by subjective factors. A reliable report generation technique based on retinal OCT images is therefore of clinical significance and value.

Retinal OCT image report generation is a challenging task. First, retinal OCT images often contain multiple abnormal symptoms, which are difficult to identify. Second, different diseases are described in different sentences, so it is difficult to generate reports correctly. To address these problems, this thesis proposes a multi-task report generation method for retinal OCT images that combines a multi-label classification task with a report generation task. The main work and contributions are summarized as follows:

1. A classification network, named Category Adaptive Multi-label Classification Network, is proposed to judge whether an OCT image is normal and to detect 15 abnormal symptoms in abnormal retinal OCT images. A convolutional neural network cannot extract visual information in a class-directed way in the multi-label classification task, so a category adaptive attention module is proposed. This module uses a learnable query matrix to generate different top-level visual features for different classes, which improves the performance of the multi-label classification network.

2. A report generation model based on the encoder-decoder framework, which combines visual and semantic information, is proposed to generate diagnostic reports automatically. First, a visual attention module is introduced into the encoder to provide the decoder with visual information related to symptoms. Second, a semantic topic generation module is proposed. This module combines the results of the multi-label classification task with semantic attention to provide the decoder with global and local semantic information related to symptoms, which improves the performance of the report generation model. Finally, during report generation, the visual and semantic information are fed into the decoder, so that the model can generate relevant descriptions for abnormal regions while locating them.

A clinical retinal OCT image-report dataset from 18,042 patients, obtained from Shantou International Eye Center, is used to train and test the proposed methods. The multi-label classification network and the image report generation model are implemented based on ResNet18. In the testing stage, the average Area Under Curve (AUC) and average accuracy of the multi-label classification network reached 86.08% and 89.38% respectively, and the BLEU-1, BLEU-2, BLEU-3, BLEU-4, ROUGE-L, and CIDEr scores of the report generation model reached 0.441, 0.370, 0.335, 0.308, 0.503, and 2.571 respectively. The experimental results show that the proposed multi-label classification network and retinal OCT image report generation model perform well and have potential application value.
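The category adaptive attention idea described above (a learnable query matrix that produces a separate visual feature for each class) can be illustrated with a minimal sketch. This is not the thesis's implementation: the scaled dot-product scoring, the softmax pooling, the feature sizes, and the names `softmax` and `category_attention` are all assumptions made for illustration, and the queries are random values here rather than learned parameters.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def category_attention(queries, features):
    """Pool one feature vector per class from a shared feature map.

    queries:  C x d list, one query vector per class (learnable in a real model).
    features: N x d list, one vector per spatial position of a CNN feature map.
    Returns (class_features as C x d, attention weights as C x N).
    Scaled dot-product scoring is an assumption, not taken from the source.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    class_features, all_weights = [], []
    for query in queries:
        # Similarity between this class's query and every spatial position.
        scores = [sum(q * f for q, f in zip(query, feat)) / scale
                  for feat in features]
        weights = softmax(scores)
        # Weighted sum over positions gives a class-specific feature vector.
        pooled = [sum(w * feat[k] for w, feat in zip(weights, features))
                  for k in range(dim)]
        class_features.append(pooled)
        all_weights.append(weights)
    return class_features, all_weights


random.seed(0)
NUM_CLASSES = 15      # the 15 abnormal symptoms mentioned in the abstract
NUM_POSITIONS = 49    # hypothetical 7 x 7 feature map, flattened
DIM = 8               # hypothetical channel count, kept small for readability

queries = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_CLASSES)]
features = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_POSITIONS)]

class_features, weights = category_attention(queries, features)
print(len(class_features), len(class_features[0]))
```

In a trained model, each class-specific vector would typically feed its own binary classifier, so that each symptom is scored from the image regions its query attends to; that final step is likewise an assumption here.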
Keywords/Search Tags: Optical coherence tomography, Medical image processing, Multi-label classification, Medical image report generation, Image captioning