Research On Character-based Direction Correction Of Document

Posted on:2020-07-13

Degree:Master

Type:Thesis

Country:China

Candidate:Z T Huang

Full Text:PDF

GTID:2428330578971053

Subject:Computer application technology

Abstract/Summary:

In recent years, the field of computer vision has developed rapidly. Optical character recognition (OCR), an important branch of computer vision, aims to accurately recognize the text contained in an image. Scanned documents are an important carrier of text in everyday life, and accurate OCR of these documents can greatly reduce labor costs. However, many factors in the acquisition process can leave the final document image in the wrong direction: for example, the document may not be placed in the correct direction before scanning, or the orientation information of the captured image may be lost. A wrong direction degrades OCR and subsequent image processing. Targeting the four directions that a document image may have, this paper presents a character-based orientation correction algorithm for document images. The algorithm detects the direction of a document by extracting the characters contained in the document image and analyzing their orientation. The main work of this study is as follows:

1. A character-based orientation correction method for document images is proposed. It uses text line detection and character segmentation to extract characters, and then obtains the direction of the document by detecting the directions of these characters. Tested on the CASIA-HWDB2.1 document dataset, the method achieves an accuracy of 97%.
2. A character segmentation method based on a fully convolutional network (FCN) is proposed. It examines a text line image column by column and decides whether each column should be a split line. It outperforms traditional image processing methods that rely on manually designed features, especially in segmenting touching characters.
3. A four-direction classification method for character images based on a residual neural network (ResNet) is proposed. Its accuracy on the CASIA-HWDB1.1 character dataset is 98.4%.

The character-based orientation correction algorithm proposed in this paper is accurate and practical, and it can adapt to various types of documents without manually designed features.
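
The abstract does not say how the per-character directions are combined into one document direction. A minimal sketch of one plausible aggregation, simple majority voting, is given below. The label convention (clockwise angles 0, 90, 180, 270), the function names, and the voting rule itself are assumptions made for illustration and are not taken from the thesis; the character extraction and ResNet classification steps are assumed to have already produced the list of per-character labels.

```python
from collections import Counter

# Hypothetical label convention: each label is the clockwise rotation
# (in degrees) that a character classifier assigned to one character.
DIRECTIONS = (0, 90, 180, 270)


def vote_document_direction(char_directions):
    """Return the most common per-character direction and its vote share."""
    if not char_directions:
        raise ValueError("no characters were extracted")
    counts = Counter(char_directions)
    # Iterating DIRECTIONS in a fixed order makes ties deterministic:
    # the smallest angle wins when two directions have equal votes.
    best = max(DIRECTIONS, key=lambda d: counts.get(d, 0))
    return best, counts[best] / len(char_directions)


def rotate_cw(image):
    """Rotate a 2D list of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def correct_orientation(image, direction):
    """Undo a clockwise rotation of `direction` degrees."""
    # Rotating clockwise by the remaining angle brings the page back to 0.
    for _ in range(((360 - direction) % 360) // 90):
        image = rotate_cw(image)
    return image
```

In this sketch a few misclassified characters do not change the result as long as most votes agree, and the returned vote share could serve as a rough confidence value for flagging pages that need manual review.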

Keywords/Search Tags:

Document direction correction, character segmentation, character direction detection, text detection, deep learning

Related items

1	The Research And Application Of Deep Convolution Neural Network In OCR Problem
2	Research On Character Level Chinese Scene Text Detection And Recognition Based On Deep Learning
3	Research On The Detection And Recognition Algorithm Of Dongba Character Based On Deep Learning
4	Application Of Text Detection Based On Semantic Segmentation In Receipt Optical Character Recognition
5	Based On The Direction Of The Field Of License Plate Location And Character Segmentation
6	Research On Character Verification Code Recognition Based On Deep Learning
7	Research On 3D Embossed Characters Segmentation Method Based On Variable Illumination Direction
8	Text Image Orientation Research And Application Based On Text Structural Features
9	The Research And Development Of Embedded Character Recognition Technology
10	Research On Fast Correction And Character Region Positioning Method Of Semiconductor Chip Image