Research On Content Extraction Technology Of Laboratory Test Sheet Based On Deep Learning

Posted on: 2022-09-15
Degree: Master
Type: Thesis
Country: China
Candidate: G X Liang
GTID: 2494306563976789
Subject: Mechanical and electrical engineering
Abstract/Summary:
The development of artificial intelligence has brought new vitality to the medical industry. Computer-aided diagnosis systems based on artificial intelligence can ease the pressure of "more patients, fewer doctors", reduce misdiagnosis, and reduce the occurrence of doctor-patient disputes. Accurately converting the image content of a laboratory test sheet into structured data is a precondition for the safe and reliable operation of such systems. In addition, images of the paper test sheets held by patients are an important source of medical big data. Chinese test sheets mix Chinese and English characters, symbols, and numbers, and no relevant public dataset is currently available, so image content recognition technology that converts test sheet content into structured data a computer can use directly is particularly important. Based on computer image processing technology, this thesis studies and analyzes the algorithms involved in test sheet content recognition and designs a test sheet information extraction system. The main work is summarized as follows.

First, preprocessing algorithms for test sheet images are analyzed and studied. To extract the test sheet region from a complex background, a fusion operator is proposed for detecting the edges of the test sheet image. Compared with a single operator, this method can still extract the edge information of the test sheet when the background is complex. Different binarization algorithms are also studied, and a two-dimensional Otsu algorithm is proposed to binarize test sheet images. This method addresses the broken text strokes and large ink blots that arise during binarization.

Second, text detection algorithms based on deep learning are studied in depth, and a multi-feature fusion text detection algorithm based on deep learning is proposed. In this method, the VGG, Inception, and ResNet feature extraction networks are integrated into the original EAST algorithm. It effectively addresses two problems: the inaccurate text boxes produced by traditional detection algorithms such as the projection method, and the serious missed detection of text boxes, especially long ones, in current deep learning methods.

Third, text recognition based on single-character recognition is analyzed and studied, and an improved method based on the Tesseract-OCR single-character recognition model is proposed. This method replaces the text region detection module of the original model with the text detection algorithm proposed in this thesis and relies on the recognition module of Tesseract for the recognition task. Recognition accuracy improves greatly, but it is still limited by the quality of character segmentation in the model, and results are poor for visually similar characters and for characters with a left-right structure. Therefore, an improved CRNN end-to-end sequence recognition model is proposed. Building on CRNN, a ResNet101-IBN(b) network is used for feature extraction, and feature reuse is used to deepen the network and improve the performance of the model, which further improves recognition accuracy.

Finally, a Chinese test sheet content extraction system is developed on the PyQt platform. After debugging and verification, the system completes the recognition task well, converts the image content of a test sheet into structured data, and meets the usage requirements.
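As a rough illustration of the binarization step mentioned in the abstract, the sketch below implements the classic one-dimensional Otsu threshold in plain Python. This is not the thesis's method: the thesis proposes a two-dimensional Otsu variant, whose details are not given here. The function names `otsu_threshold` and `binarize`, and the flat list-of-gray-levels input format, are assumptions made only for this example.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance.

    Classic 1-D Otsu, shown for illustration only (the thesis uses a 2-D variant).
    `pixels` is a flat list of integer gray levels in 0..255.
    Pixels <= t form class 0; pixels > t form class 1.
    """
    # Build the 256-bin gray-level histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of gray levels in class 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 is still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 is empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map gray levels above t to 255 (white) and all others to 0 (black)."""
    return [255 if p > t else 0 for p in pixels]


if __name__ == "__main__":
    # Hypothetical sample: dark ink strokes mixed with bright paper.
    sample = [10, 12, 11, 200, 205, 198]
    t = otsu_threshold(sample)
    print(t, binarize(sample, t))
```

A one-dimensional threshold looks only at each pixel's own gray level. Two-dimensional Otsu variants generally also consider a neighborhood statistic, which is one plausible reason the thesis reports fewer broken strokes and ink blots.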
Keywords/Search Tags: Chinese medical test sheet, Deep learning, Text detection, Character recognition, Image processing