In recent years, with the rapid development and wide application of deep learning, daily life has changed dramatically, and human-computer interaction has become simpler and more efficient. As an important branch of computer vision, text recognition is now applied throughout daily life. In education, answer cards for objective questions are already read and scored automatically, but for non-multiple-choice questions the scanned images are still graded manually. Automating the scoring of subjective questions is therefore an important direction. Thanks to the rapid development and large-scale application of text recognition technology, the content of subjective answers carried by scanned images can be transcribed into character encodings that a computer can process further, which makes it possible to automate the whole marking process.

This paper first collects scanned subjective-question answer cards from real examination scenarios, including the college entrance examination. On this basis, the handwritten text on the answer cards is recognized and transcribed using text detection and text recognition techniques, which makes subsequent automatic scoring and permanent storage of examination papers possible. Unlike text detection and recognition in ordinary natural scenes, handwritten text on answer cards poses substantial challenges of its own because of the layout of the answer card and the varied writing habits and styles of candidates. Based on the characteristics of scanned answer-card images and candidates’ handwriting, this paper proposes a method for detecting handwritten Chinese text in examination papers and a method for recognizing it. The two methods are then integrated and encapsulated into an automatic examination paper recognition system. The main work is as follows:

(1) A handwritten Chinese text recognition dataset and a handwritten Chinese text detection dataset, both labeled on real-scene data, are collected and organized, providing solid data support for model training.

(2) A method for detecting handwritten Chinese text in examination papers is proposed. Because existing regression-based methods have difficulty detecting long text and tend to merge multiple lines of text into one, a segmentation-based text detection method is proposed. A proposal module is introduced that generates, through multiple projections, a proposal map roughly covering the text area, and this map is used in computing the probability map to guide the prediction results. For the pixel-level binary map, a "climber" algorithm is proposed to split out line-level text instances. For the resulting text bounding boxes, a corner sniffing method and a bounding box merging algorithm are proposed to adjust the box coordinates and refine the predictions. The method is trained and tested on the handwritten Chinese text detection dataset, compared with mainstream methods, and examined through ablation experiments on each module. Finally, the model's results are visualized.

(3) A method for recognizing handwritten Chinese text in examination papers is proposed. Because handwriting varies in shape and similar-looking characters are hard to distinguish, ResNet_vd is used as the backbone network to avoid the partial loss of information that occurs during downsampling in ResNet. So that the model focuses on pixels carrying text information during feature extraction, an attention mechanism applied to each column of the feature map is proposed. The context information contained in a text line is modeled, and the model is trained and makes predictions with this context information. Because the whole procedure is modular, removing the feature serialization and context modeling steps yields a new procedure that can recognize multi-line text paragraphs. The method is trained and tested on the handwritten Chinese text recognition dataset, compared with mainstream methods, and examined through ablation experiments on each module. Finally, the model's results are visualized.

(4) An automatic examination paper recognition system is designed and implemented. By combining the text detection method with the text recognition method, a complete pipeline from image reading to result output is realized, helping examination staff automate the whole process of marking examination papers.
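
As a rough illustration of the projection idea mentioned in item (2), the following minimal sketch builds a coarse proposal map from the row and column projections of a binary ink mask. It is not the proposal module of this paper: the function name `projection_proposal`, the parameter `min_ratio`, and the thresholding rule are all assumptions made only for this example.

```python
import numpy as np


def projection_proposal(ink_mask: np.ndarray, min_ratio: float = 0.05) -> np.ndarray:
    """Return a coarse boolean proposal map from row and column projections.

    ink_mask:  2-D array, nonzero where a pixel is likely to be ink.
    min_ratio: hypothetical threshold, given as a fraction of the peak
               projection value (an assumption, not a value from the paper).
    """
    mask = ink_mask.astype(bool)
    rows = mask.sum(axis=1)  # ink pixels per row (horizontal projection)
    cols = mask.sum(axis=0)  # ink pixels per column (vertical projection)

    # A blank page has no text area to propose.
    if rows.max() == 0:
        return np.zeros(mask.shape, dtype=bool)

    # Keep rows and columns whose projection reaches the threshold.
    row_keep = rows >= min_ratio * rows.max()
    col_keep = cols >= min_ratio * cols.max()

    # A pixel is proposed when both its row and its column are kept,
    # so the result is a coarse, rectangle-like over-estimate of the text area.
    return row_keep[:, None] & col_keep[None, :]


if __name__ == "__main__":
    page = np.zeros((6, 8), dtype=np.uint8)
    page[1:3, 2:6] = 1  # a short two-row stroke block
    page[4, 1:7] = 1    # a longer single-row stroke
    print(projection_proposal(page, min_ratio=0.1).astype(int))
```

Because the map is the intersection of kept rows and kept columns, it deliberately over-covers the text area; in a segmentation-based detector such a map would only serve as a rough prior to be refined by the pixel-level prediction.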