Research And Applications On Detection And Recognition Algorithms For School's English Composition Texts Based On Deep Learning

Posted on: 2022-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: K N Wu
GTID: 2480306332467584
Subject: Computer Science and Technology
Abstract/Summary:
With the gradual adoption of deep learning across fields, more and more industries are using it to work more efficiently. In education, there is a pressing need for more intelligent algorithms that ease teachers' burden in marking examination papers and increase the fairness of non-subjective assessment during marking. Therefore, this thesis uses deep learning to study text-line localization and recognition, and designs and implements an automated marking prototype system. Two innovations are proposed.

First, the field of text localization lacks datasets that are labeled with both English and handwriting features at the text-line level, so English handwritten text lines on test papers cannot be localized directly. Hence, a data synthesis method based on random insertion is proposed, and the text-line detection dataset of English test papers required for the experiments is synthesized with it. Taking advantage of the shape of text lines, a small number of labeled text lines are randomly inserted into a background image. The insertion rules are defined according to the characteristics of the different lines in an English composition, which may appear at different positions. The text-line characteristics of the synthetic images are varied, which increases sample diversity. Because the annotation error of the synthetic data is minimal, the text-line detection model trained on it achieves high accuracy, which demonstrates the effectiveness of the synthesized dataset.

Second, most previous text recognition models are based on character or word recognition and do not consider the semantic relationships between different layers of text features within a long text-line image. Hence, a text recognition method based on multi-feature fusion is proposed. Features at different layers represent different levels of semantic information; therefore, by adding two feature extraction branches at different scales to the convolutional feature extraction layers at different depths, semantic features ranging from shallow character edges to deep sequence contours can be extracted together. Experiments compare the proposed model with multiple baseline models and with ablation models. The results demonstrate that multi-feature fusion improves recognition accuracy.

Finally, an Encoder-Decoder based text scoring model is constructed, and an automated marking prototype system is designed and implemented. The prototype system mainly consists of a handwritten text-line detection module, a long text-line recognition module and a text-scoring module. The system evaluation results support the correctness of the proposed algorithms.
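
The random-insertion idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`synthesize_page`, `overlaps`), the parameters (`margin`, `gap`, `max_tries`) and the uniform placement rule are all assumptions made for illustration, and the position rules that the thesis defines for different lines of a composition are not reproduced. Each text-line crop is represented only by its width and height; the box where a crop is placed doubles as its ground-truth label, which is why annotation error in such a dataset can stay small.

```python
import random


def overlaps(a, b, gap=0):
    """Return True if boxes a and b (x, y, w, h) come closer than `gap` pixels."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return not (ax + aw + gap <= bx or bx + bw + gap <= ax or
                ay + ah + gap <= by or by + bh + gap <= ay)


def synthesize_page(page_w, page_h, line_sizes, margin=20, gap=4,
                    max_tries=100, seed=None):
    """Place text-line crops at random, non-overlapping positions on a page.

    line_sizes: list of (width, height), one per labeled text-line crop.
    Returns a list of (x, y, w, h) boxes. Each box is both the position
    where the crop would be pasted and its ground-truth annotation.
    Hypothetical helper: uniform placement is an assumption of this sketch.
    """
    rng = random.Random(seed)
    boxes = []
    for w, h in line_sizes:
        if w > page_w - 2 * margin or h > page_h - 2 * margin:
            continue  # crop cannot fit inside the page margins
        for _ in range(max_tries):
            x = rng.randint(margin, page_w - margin - w)
            y = rng.randint(margin, page_h - margin - h)
            candidate = (x, y, w, h)
            if not any(overlaps(candidate, b, gap) for b in boxes):
                boxes.append(candidate)
                break  # placed; a crop is skipped if every try fails
    return boxes


# Usage: three handwritten-line crops on an 800 x 1000 background.
labels = synthesize_page(800, 1000, [(600, 40), (300, 40), (500, 40)], seed=0)
for box in labels:
    print(box)
```

In a full pipeline, each returned box would be used to paste the matching crop onto the background image and would then be written out as the detection label for that line.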
Keywords/Search Tags: Text-line Detection, Data Synthesis, Handwritten Recognition, Multi-feature Fusion