
Research On Desensitization Technology For Chinese Medical Report Images

Posted on: 2024-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
GTID: 2544306941463744
Subject: Computer technology
Abstract/Summary:
With the rapid development and widespread adoption of healthcare information systems, patient examination results and diagnostic information are often presented in the form of medical reports. Medical report images have become important information resources in fields such as smart healthcare, online consultation, and medical research. While these images must remain convenient for the relevant personnel to use, it is of paramount importance to ensure information security and privacy protection. Although technologies for medical text desensitization and image desensitization have been proposed, research on the desensitization of Chinese medical report images is still lacking. This thesis investigates desensitization technologies for sensitive printed information and Chinese handwritten signatures in Chinese medical report images. The work can be summarized in the following three aspects.

(1) A printed sensitive information desensitization framework (PSI-DF) is proposed to desensitize sensitive printed information in Chinese medical report images. PSI-DF consists of five modules: sensitive information construction, text detection and text recognition, sensitive information extraction, sensitive information location, and sensitive information desensitization. The distinguishing feature of PSI-DF is the sensitive information location module, in which a dynamic segmentation algorithm is presented to accurately locate sensitive information. Apart from the sensitive information location module, the modules of PSI-DF can be flexibly designed according to specific requirements, which makes PSI-DF convenient to use. Because there is no publicly available Chinese medical report image dataset, we collect Chinese medical report images from the Internet and private contributions and annotate the sensitive information regions in these images, constructing a Chinese medical report image (CMRI) dataset for algorithm testing. In addition, we design a metric, called valid desensitization (VD), to evaluate the desensitization performance of the algorithm. Experimental results show that the proposed VD metric is a reasonable index for evaluating desensitization performance, and that PSI-DF performs satisfactorily in desensitizing sensitive printed information in medical report images.

(2) A handwritten signature sensitive information desensitization framework (HSSI-DF) is proposed to address the inaccurate detection of handwritten signatures in Chinese medical report images, which is caused by the text detection model in PSI-DF. HSSI-DF consists of two modules: handwritten signature detection and handwritten signature desensitization. Currently, there is a lack of publicly available medical report image datasets that contain Chinese handwritten signatures, and the number of such images in the CMRI dataset collected in this thesis is also very limited. To ensure the performance of the handwritten signature detection model, we construct a Chinese medical report fusion image (CMRFI) dataset containing Chinese handwritten signatures by fusing Chinese handwritten signatures with Chinese medical report templates and applying data augmentation to the fused images. In the handwritten signature detection module, we train an object detection model on the CMRFI dataset to detect Chinese handwritten signatures. Experimental results show that the proposed HSSI-DF achieves satisfactory detection and desensitization performance for Chinese handwritten signatures.

(3) A comprehensive sensitive information desensitization framework (CSI-DF) is proposed, and a desensitization system for Chinese medical report images is designed and implemented. To handle Chinese medical report images in which sensitive printed information and handwritten signatures are present at the same time, this thesis integrates and optimizes the two proposed desensitization frameworks into CSI-DF. Experimental results show that the proposed CSI-DF can effectively desensitize both the printed sensitive information and the handwritten signature sensitive information in Chinese medical report images. Based on the three proposed desensitization frameworks, a desensitization system for Chinese medical report images is designed and implemented. Users can conveniently upload, desensitize, and download medical report images through this system. For sensitive terms that are not effectively desensitized, users can provide feedback directly. Meanwhile, system administrators can easily manage the keyword list and framework parameters. After verification, administrators can add the sensitive terms provided by users to the keyword list to ensure that these terms are effectively desensitized.
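The sensitive information extraction and location steps of PSI-DF can be illustrated with a minimal Python sketch. It is not the thesis's dynamic segmentation algorithm, whose details the abstract does not give. As a stand-in, it splits a recognized text line's bounding box in proportion to approximate character widths. The names `TextLine`, `char_units`, `locate_span`, and `find_sensitive_boxes`, the keyword-then-value layout, and the width heuristic are all assumptions made for illustration.

```python
import unicodedata
from dataclasses import dataclass


@dataclass
class TextLine:
    """One OCR result (hypothetical shape): text plus its box in pixels."""
    text: str
    x: int
    y: int
    w: int
    h: int


def char_units(ch):
    """Approximate glyph width: wide or full-width characters count as 2 units."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def locate_span(line, start, end):
    """Estimate the box of line.text[start:end] by dividing the line box
    in proportion to character widths (assumed heuristic, not the thesis's)."""
    units = [char_units(c) for c in line.text]
    total = sum(units)
    left = line.x + line.w * sum(units[:start]) // total
    right = line.x + line.w * sum(units[:end]) // total
    return (left, line.y, right - left, line.h)


def find_sensitive_boxes(lines, keywords):
    """Return (x, y, w, h) boxes for the value that follows each keyword."""
    boxes = []
    for line in lines:
        for kw in keywords:
            idx = line.text.find(kw)
            if idx == -1:
                continue
            # Skip the keyword itself and any separator after it.
            start = idx + len(kw)
            while start < len(line.text) and line.text[start] in ":： ":
                start += 1
            # The value is assumed to run until the next whitespace.
            end = start
            while end < len(line.text) and not line.text[end].isspace():
                end += 1
            if end > start:
                boxes.append(locate_span(line, start, end))
    return boxes


line = TextLine("姓名:张三 性别:男", x=0, y=10, w=170, h=20)
print(find_sensitive_boxes([line], ["姓名"]))  # [(50, 10, 40, 20)]
```

Each returned box could then be filled or blurred by whatever desensitization module is chosen. Masking only the value, rather than the whole detected line, keeps non-sensitive text such as field labels readable.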
Keywords/Search Tags: Medical Report Images, Sensitive Information Desensitization, Optical Character Recognition, Named Entity Recognition, Object Detection