| Document image preprocessing plays a critical role in OCR (Optical Character Recognition) systems, and its effectiveness directly affects the recognition results. Traditional preprocessing algorithms based on digital image processing often have one or more parameters that must be tuned manually, and these parameters usually do not transfer to images captured under different imaging conditions, which limits the adaptability of the algorithms. Across many areas of computer vision, algorithms built on neural networks have shown excellent performance and good generalization, so introducing neural networks into document image preprocessing can be expected to improve the overall performance of preprocessing algorithms. This thesis studies document image preprocessing algorithms in depth and proposes a document image preprocessing system based on neural networks. The system performs document image preprocessing end to end, handling both the geometric and the non-geometric interference introduced during image acquisition, in order to improve the visual quality of the original image and the OCR recognition results. The main contributions of this thesis are as follows: 1. A lightweight geometric correction network (AsymcNet) that integrates document segmentation and correction is proposed. After geometric correction of the original image with AsymcNet, the CER (Character Error Rate) of the OCR results falls from 57.0% to 27.3%. Compared with DFE-FC, one of the more advanced methods in the industry, AsymcNet reduces CER by 3.2%, and its average processing time for a single image is only 8.85% of that of DFE-FC. 2. Based on generative adversarial learning and Retinex theory, a self-supervised light correction network (RtnxGAN) is proposed, which can be trained on unpaired training sets. RtnxGAN handles the non-geometric interference in
images, and the CER of geometrically corrected images is further reduced to 21.9% after they are processed by the light correction network. Compared with the fully supervised illumination correction network ILL-Net, RtnxGAN improves the SSIM (Structural Similarity Index Measure) of the results by 0.024 and the PSNR (Peak Signal-to-Noise Ratio) by 1.418, and its average processing time for a single image is only 59.22% of that of ILL-Net. 3. The software of the document image preprocessing system is implemented in Python and PyQt5. Based on the characteristics of the algorithms, suitable hardware is selected, the hardware platform for the whole preprocessing system is built, and the proposed preprocessing algorithm is verified in a practical application. In summary, this thesis completes the design and implementation of a complete image preprocessing algorithm based on AsymcNet and RtnxGAN. The proposed algorithm performs document image preprocessing with high robustness and improves both the visual quality and the OCR recognition accuracy of the image. Comparison and ablation experiments confirm the effectiveness and competitiveness of the proposed algorithm. |
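
The abstract reports its OCR results as CER (Character Error Rate). As an illustration only, the sketch below computes CER under the common definition: the character-level edit distance between the OCR output and the reference text, divided by the reference length. The abstract does not state which exact definition or tooling the thesis uses, so this definition and the function names `levenshtein` and `cer` are assumptions for this example, not the thesis's evaluation code.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `ref` into `hyp`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance normalized by reference length.

    Assumed definition (not confirmed by the abstract). The value can
    exceed 1.0 when the hypothesis is much longer than the reference.
    """
    if not reference:
        raise ValueError("reference text must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical strings: one substituted character out of eight.
    print(cer("document", "docoment"))
```

Under this definition, a drop in CER from 57.0% to 27.3% means that the number of character edits needed to recover the reference text from the OCR output, per reference character, is roughly halved.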