As a pre-processing step in document analysis systems, the binary segmentation of text and background plays a key role in the accuracy and visual quality of downstream tasks such as character recognition. Most binarization algorithms are built on low-level features in an unsupervised manner, so domain knowledge about the input cannot be fully exploited, which greatly limits their ability to distinguish foreground text from background noise. With the wide application of deep learning across computer vision, researchers have begun to apply deep learning models to the binarization problem and have achieved good segmentation results. Accordingly, this paper focuses on deep-learning-based binarization algorithms for low-quality document images. The main work and innovations are as follows:

(1) Twelve binarization algorithms are reviewed, comprising six classic traditional algorithms and six recent deep-learning-based algorithms. Each algorithm is briefly summarized, and its advantages and disadvantages are analyzed through experimental results.

(2) The first proposed algorithm addresses the limited size of neural-network training sets: a text enhancement network (TANet) is proposed to expand the dataset, making full use of existing document images, and an improved D-LinkNet (MD-LinkNet) then serves as the binary segmentation network. The network contains two improvements. First, a residual multi-kernel pooling (RMP) module and a cascaded atrous convolution (CAC) module are inserted between the encoder and decoder to extract rich document stroke features. Second, the pooled low-resolution feature maps are upsampled with DUpsample instead of traditional bilinear interpolation, which incorporates pixel-neighborhood information of the document image (a sketch of these modules is given below). Using the datasets and evaluation metrics provided by the International Document Image Binarization Contest (DIBCO), the algorithm is compared with the twelve binarization algorithms above. Experimental results show that its F-measure improves by 3.5% over the second-best method, U-Net.

(3) The second proposed algorithm targets the uneven distribution of text in historical document images, which causes a single neural network to produce noisy binary segmentations. A cascaded convolutional neural network is proposed to address the core problem of multi-scale information fusion in the binarization task. The algorithm first uses a U-Net as the base segmentation network to retain complete stroke information; the segmentation results at different scales are then fused and fed into the proposed MD-LinkNet for training and testing; finally, a convolutional conditional random field (ConvCRF) is applied as post-processing to remove isolated noise points (see the pipeline sketch below). Experimental results show that, while retaining complete strokes, the algorithm better suppresses noise in document images with small text.
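To make the MD-LinkNet improvements in (2) concrete, the following is a minimal PyTorch sketch of the three ideas: a cascaded atrous convolution (CAC) block, a residual multi-kernel pooling (RMP) block, and a DUpsample head realized as a 1x1 convolution followed by pixel shuffle. The channel widths, dilation rates, and pooling sizes here are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CACBlock(nn.Module):
    """Cascaded atrous (dilated) convolutions; partial results are
    accumulated residually, mixing several receptive-field sizes."""
    def __init__(self, channels, dilations=(1, 2, 4, 8)):  # assumed rates
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        ])

    def forward(self, x):
        out, feat = x, x
        for conv in self.convs:
            feat = F.relu(conv(feat))  # each stage sees the previous one
            out = out + feat           # residual sum over all scales
        return out

class RMPBlock(nn.Module):
    """Residual multi-kernel pooling: pool at several window sizes,
    compress each branch to one channel, upsample back, and
    concatenate with the input before a 1x1 fusion convolution."""
    def __init__(self, channels, pool_sizes=(2, 3, 5, 6)):  # assumed sizes
        super().__init__()
        self.pools = nn.ModuleList([nn.MaxPool2d(p, stride=p) for p in pool_sizes])
        self.convs = nn.ModuleList([nn.Conv2d(channels, 1, 1) for _ in pool_sizes])
        self.fuse = nn.Conv2d(channels + len(pool_sizes), channels, 1)

    def forward(self, x):
        h, w = x.shape[2:]
        branches = [
            F.interpolate(conv(pool(x)), size=(h, w), mode="bilinear",
                          align_corners=False)
            for pool, conv in zip(self.pools, self.convs)
        ]
        return self.fuse(torch.cat([x] + branches, dim=1))

class DUpsampleHead(nn.Module):
    """Data-dependent upsampling: a learned 1x1 conv predicts r*r
    sub-pixel logits per location, rearranged to full resolution by
    pixel shuffle, in place of bilinear interpolation."""
    def __init__(self, channels, num_classes=2, ratio=4):
        super().__init__()
        self.proj = nn.Conv2d(channels, num_classes * ratio * ratio, 1)
        self.shuffle = nn.PixelShuffle(ratio)

    def forward(self, x):
        return self.shuffle(self.proj(x))

# Quick shape check on a dummy encoder feature map.
feat = torch.randn(1, 64, 32, 32)
feat = RMPBlock(64)(CACBlock(64)(feat))
logits = DUpsampleHead(64)(feat)  # -> (1, 2, 128, 128)
print(logits.shape)
```

The intuition is that CAC enlarges the receptive field without losing resolution, RMP summarizes context at several window sizes, and DUpsample lets the network learn how to reconstruct fine stroke boundaries rather than interpolating them.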
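Similarly, the cascaded pipeline in (3) can be summarized as the sketch below, assuming hypothetical model objects `unet` and `md_linknet` and a `crf_refine` callable standing in for ConvCRF post-processing; the fusion rule (averaging probability maps predicted at several input scales) is an illustrative assumption rather than the paper's exact scheme.

```python
import torch
import torch.nn.functional as F

def multiscale_probs(model, image, scales=(0.75, 1.0, 1.25)):
    """Run the base segmenter at several input scales and average the
    resulting probability maps at the original resolution."""
    h, w = image.shape[2:]
    probs = []
    for s in scales:
        resized = F.interpolate(image, scale_factor=s, mode="bilinear",
                                align_corners=False)
        p = torch.sigmoid(model(resized))
        probs.append(F.interpolate(p, size=(h, w), mode="bilinear",
                                   align_corners=False))
    return torch.stack(probs).mean(dim=0)

def binarize(image, unet, md_linknet, crf_refine):
    # Stage 1: U-Net keeps complete strokes; fuse its multi-scale outputs.
    coarse = multiscale_probs(unet, image)
    # Stage 2: MD-LinkNet refines, here (an assumption) conditioned on
    # the image concatenated with the coarse probability map.
    refined = torch.sigmoid(md_linknet(torch.cat([image, coarse], dim=1)))
    # Stage 3: ConvCRF-style post-processing removes isolated noise points.
    return crf_refine(image, refined) > 0.5
```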