In recent years, the field of computer vision has developed rapidly. Optical character recognition (OCR), an important branch of computer vision, aims to accurately recognize the text contained in an image. Scanned documents are an important carrier of text in everyday life, and accurate OCR of these documents can greatly reduce labor costs. However, many factors in the acquisition process can leave the final document image in the wrong orientation: for example, the document may not be placed correctly before scanning, or the orientation information of the acquired image may be lost. A wrong orientation degrades OCR and subsequent image processing.

To address the four orientations that a document image may have, this paper presents a character-based orientation correction algorithm for document images. The algorithm determines the orientation of a document by extracting the characters it contains and analyzing their orientation. The main contributions of this study are as follows:

1. A character-based orientation correction method for document images is proposed. It uses text line detection and character segmentation to extract characters, then obtains the orientation of the document from the detected orientations of these characters. Tested on the CASIA-HWDB2.1 document dataset, the method achieves an accuracy of 97%.
2. A character segmentation method based on a fully convolutional network (FCN) is proposed. It scans a text line image column by column and decides whether each column should be a split line. It outperforms traditional image processing methods that rely on manually designed features, especially when segmenting touching characters.
3. A four-direction classification method for character images based on the residual neural network (ResNet) is proposed. Its accuracy on the CASIA-HWDB1.1 character dataset is 98.4%.

The character-based orientation correction algorithm proposed in this paper is accurate and practical, and it adapts to various types of documents without manually designed features.
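
The final step, deriving one document orientation from many per-character predictions, can be illustrated with a minimal sketch. The abstract does not say how the character directions are combined, so the majority vote below is an assumption, as is the label convention (each label is the number of 90-degree counter-clockwise turns needed to make the content upright). The function names are hypothetical, and a nested list stands in for the image; this is not the authors' implementation.

```python
from collections import Counter


def rotate_ccw(image):
    """Rotate a row-major 2D list by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*image)][::-1]


def vote_quarter_turns(char_labels):
    """Return the most frequent per-character label (assumed to be 0, 1, 2 or 3).

    Majority voting is an assumption; the abstract only states that the
    document direction is obtained from the character directions.
    """
    if not char_labels:
        raise ValueError("no characters were extracted from the document")
    return Counter(char_labels).most_common(1)[0][0]


def correct_orientation(image, char_labels):
    """Apply the voted number of counter-clockwise quarter turns to the image."""
    for _ in range(vote_quarter_turns(char_labels) % 4):
        image = rotate_ccw(image)
    return image


# Toy usage: a tiny "image" scanned one quarter turn clockwise.
upright = [[1, 2, 3], [4, 5, 6]]
scanned = rotate_ccw(rotate_ccw(rotate_ccw(upright)))  # three CCW turns = one clockwise turn
labels = [1, 1, 0, 1, 3, 1]  # hypothetical classifier outputs, a few of them wrong
restored = correct_orientation(scanned, labels)
```

Voting over many characters means that a few misclassified characters do not change the document-level decision, as long as most predictions agree.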