
Study On Multi-task Image Understanding Model For Endoscopy With Intelligent Diagnosis

Posted on: 2023-10-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T Yu
Full Text: PDF
GTID: 1524306836954609
Subject: Biomedical engineering
Abstract/Summary:
Gastrointestinal (GI) endoscopy is currently the standard screening method of choice for gastric diseases. The examining physician inserts a long, thin, flexible tube-like device into the body and slowly advances it to the target area of the organ. The internal surface of the organ is imaged in real time through optical lenses and image sensors, and the physician assesses the health of the GI mucosa in the video images based on clinical knowledge and experience. To assist clinicians in the early screening and diagnosis of GI cancer-related lesions and to reduce missed diagnoses and misdiagnoses, researchers have developed computer-aided detection and diagnosis systems based on image understanding technology. These systems help clinicians locate lesions and assess risk in real time, help improve clinical lesion detection, and standardize the examination procedure.

Currently, medical image understanding technology based on deep learning is developing rapidly. It excels in tasks such as classification, detection, tracking, segmentation, registration and generation of images and videos, and achieves diagnostic accuracy comparable to that of clinical experts on some lesion identification tasks and datasets. However, in research on and deployment of deep-learning-based intelligent diagnostic methods for GI endoscopy, existing work has mostly studied single-task models in isolation. Such models can hardly meet the broad demand for full-featured intelligent computer-aided diagnosis in complex real clinical examination environments. In particular, as the number of isolated tasks grows, the real-time requirements of clinical examination become difficult to meet, which hinders intelligent diagnosis in endoscopy.

This paper investigates an end-to-end multi-task image understanding model for intelligent diagnosis in GI endoscopy, built on multi-task learning. The model is trained and deployed on multiple clinical datasets to achieve multi-task inference capability in terms of both time performance and diagnostic efficacy. Through a theoretical and methodological study of the problems and limitations of deep learning for intelligent GI endoscopy diagnosis, this paper proposes content and time-series decoupling methods in the multi-task learning setting and a multi-task integration model design scheme based on task relationship analysis, and carries out clinical practice and validation of the resulting real-time GI endoscopy detection and diagnosis system. Specifically, the main research of this paper includes:

1) Research on content decoupling modeling method
This paper proposes a model training method to alleviate content coupling in the training of single-task models. Content coupling arises when the feature representation of the target task is weak, so that the model's features become coupled to other, irrelevant features, which reduces system generalization and model reliability. Based on the prior knowledge that the coupled tasks are irrelevant to each other, the method applies partial jigsaw data augmentation during training and introduces a new loss function based on the class activation map to guide the model's feature coupling correctly. This achieves content decoupling of irrelevant tasks and ensures that the task relations obtained in the subsequent multi-task correlation analysis are accurate. Taking the gastric precancerous lesion recognition model in the GI endoscopy scenario as an example, the content decoupling approach effectively overcomes the coupling of precancerous lesion recognition features to anatomical location recognition features. It improves lesion recognition accuracy by 1.20% and 0.84%, while significantly improving the consistency of model features in the class activation map visualization.

2) Research on time-series decoupling modeling method
In multi-task learning built from isolated single-task models, multiple tasks are executed serially in time, which leads to redundant computation and degrades the real-time performance of the system. To address this time-series coupling problem, this paper proposes a model construction and training method. Based on the prior knowledge that the time-series coupled tasks are related, the method fuses and adjusts the model structure and uses an improved triplet metric learning method to improve the task specificity of the features. This achieves time-series decoupling of the relevant tasks, improves the time performance of the model, and makes parallel output feasible for the subsequent multi-task integration model. Taking object detection and object tracking of colonoscopic polyps in the GI endoscopy scenario as an example, decoupling the detection task from the tracking task in time yields a 30% improvement in model inference speed without loss of detection or tracking accuracy.

3) Research on multi-task integration model approach based on task relations analysis
The isolated study of a few single-task models can hardly provide full-featured intelligent assisted diagnosis in complex real clinical examination environments. To solve this problem, this paper combines the content and time-series decoupling methods above and, through a study of representational similarity in deep learning models, proposes a multi-task integration model method based on task correlation analysis. Starting from single-task models of the different tasks, the method quantitatively extracts, at different levels of the models, an RSA correlation coefficient matrix that characterizes the correlation between tasks. With this task correlation as prior knowledge, it designs and constructs a layer-by-layer progressive, end-to-end multi-task integration model that fully realizes feature sharing and information transfer between tasks and improves the inference and time performance of the model. Based on this approach, a multi-task integration model covering nine GI endoscopy-oriented intelligent diagnostic tasks is implemented. Compared with the conventional combination of multiple single-task models, the overall model volume is reduced by 42%, the real-time inference speed is increased by a factor of 1, and the accuracy on each task remains comparable to or slightly exceeds that of the corresponding single-task model.

4) Design and practice of a computer-aided detection and diagnosis system based on task-related multi-task integration model
Based on the multi-task integration model above, this paper develops a GI endoscopy video-assisted detection and diagnosis system and evaluates and validates it on a real GI endoscopy video dataset. The experimental results show that the system meets the clinical requirements for multifunctional intelligent assisted diagnosis in real time and improves performance in terms of lesion detection rate, real-time capability and accuracy.
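The task correlation analysis in the third contribution can be illustrated with a minimal sketch of representational similarity analysis (RSA). The abstract does not give the exact procedure, so everything below is an assumption: each single-task model is taken to produce one feature vector per image on a shared image set, dissimilarity is measured as 1 minus the Pearson correlation, and tasks are compared by a rank correlation of their dissimilarity matrices. The function names, task names and random features are hypothetical.

```python
import numpy as np

def rdm(features):
    """Representational dissimilarity matrix for one task model.

    features: array of shape (n_images, n_features), one row per image.
    Entry (i, j) is 1 - Pearson correlation between images i and j.
    """
    return 1.0 - np.corrcoef(features)

def upper_triangle(matrix):
    """Flatten the strictly upper triangle (each image pair once)."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def ranks(values):
    """Ordinal ranks; ties are not averaged (a simplification)."""
    return np.argsort(np.argsort(values)).astype(float)

def rsa_matrix(task_features):
    """Task-by-task rank correlation of RDMs (assumed form of the
    RSA correlation coefficient matrix)."""
    names = list(task_features)
    vectors = [ranks(upper_triangle(rdm(task_features[n]))) for n in names]
    return names, np.corrcoef(np.vstack(vectors))

# Hypothetical features from three single-task models on the same 20 images.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 64))
task_features = {
    "task_a": base,
    "task_b": 2.0 * base + 3.0,           # same geometry as task_a
    "task_c": rng.normal(size=(20, 64)),  # unrelated representation
}
names, corr = rsa_matrix(task_features)
```

In this toy setup `corr[0, 1]` is close to 1 because `task_b` is an affine rescaling of `task_a`, while `corr[0, 2]` is much lower. A matrix of this kind, computed per layer, is the sort of signal that could indicate which tasks can share features in an integrated model.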
Keywords/Search Tags: Gastrointestinal Endoscopy, Computer Aided Diagnosis, Multi-task Learning, Image Understanding, Convolutional Neural Networks