
Research On Multi-oriented Text Detection And Recognition Technology In Complex Natural Scene

Posted on: 2021-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhi
Full Text: PDF
GTID: 2428330614471290
Subject: Control engineering
Abstract/Summary:
With the development of the Internet and 5G technology, digital images are becoming one of the main carriers of information. Natural scene images often contain text that carries significant information and helps us understand the scene at a higher semantic level. Demand for automatic detection and recognition of text in images is growing, with applications in autonomous driving, image search, intelligent transportation systems and other fields. However, although traditional optical character recognition (OCR) technology is mature, recognizing text in natural scenes still faces a series of challenges, such as complex backgrounds, diverse fonts and highly variable text. This thesis studies text detection and recognition in natural scenes in depth, focusing on the detection and correction of multi-oriented text and on sequence-based text recognition. The details are as follows:

(1) Multi-oriented text detection in natural scenes based on region proposals. CTPN is a text detection algorithm based on region proposals that detects horizontal text lines well but localizes multi-oriented text with poor precision. Based on the characteristics of text in natural scenes, this thesis improves the way CTPN constructs text lines from text proposals so that text regions are located accurately, and then corrects tilted text with a spatial perspective transformation. Experimental results show that the improved method achieves an F-score of 0.79 on the MSRA-TD500 dataset and accomplishes multi-oriented text detection in natural scenes.

(2) Text recognition in natural scenes based on character semantic sequences. This thesis constructs a text recognition model that integrates feature extraction, sequence prediction and decoding into a unified network. First, a DenseNet convolutional neural network extracts a feature sequence from the image; an LSTM network then makes predictions over the feature sequence; finally, a decoding mechanism translates the predicted sequence into the output character sequence. A large synthetic dataset for natural scene text recognition is also generated to train the proposed model. The model is evaluated on the public dataset, with experiments designed to explore the effect of the LSTM network on character sequence prediction and to compare the CTC and attention decoding mechanisms for character sequence translation. The experimental results show that the architecture combining DenseNet, LSTM and CTC achieves the best recognition performance.

(3) Design of the natural scene text detection and recognition model. Based on the proposed detection and recognition algorithms, the two models are combined into an end-to-end system for natural scene text recognition. In addition, for slanted multi-oriented text lines, the text correction module normalizes the text to horizontal, which improves recognition accuracy.

In summary, this thesis realizes an end-to-end natural scene text recognition model by solving the two key problems of scene text detection and recognition. Experiments on datasets including ICDAR 2013 and ICDAR 2017 RCTW show that the proposed model performs well in terms of both precision and recall and accomplishes the task of text recognition in natural scenes. The thesis contains 50 figures, 11 tables and 62 references.
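
The CTC decoding step mentioned in (2) can be illustrated with a greedy (best-path) decoder. This is a minimal sketch and not the thesis's implementation: the function name `ctc_greedy_decode`, the choice of index 0 as the blank label, and the toy alphabet are all assumptions made for illustration. It takes the per-frame class scores that a sequence model such as an LSTM would produce, picks the best class in each frame, merges consecutive repeats, and drops blanks.

```python
from itertools import groupby

BLANK = 0  # assumption: class index 0 is reserved for the CTC blank label


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy CTC decoding of a sequence of per-frame class scores.

    frame_scores: list of frames, each a list of scores (one per class).
    alphabet: sequence mapping class index -> character; the entry at
              the blank index is a placeholder and is never emitted.
    """
    # 1. Best path: take the highest-scoring class in every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge runs of identical consecutive labels into one label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Remove blanks; a blank between two equal labels keeps both.
    return "".join(alphabet[label] for label in collapsed if label != blank)


if __name__ == "__main__":
    alphabet = "_helo"  # hypothetical toy alphabet; "_" stands for blank
    path = [1, 1, 0, 2, 3, 0, 3, 4]  # h h _ e l _ l o
    frames = [[1.0 if i == k else 0.0 for i in range(5)] for k in path]
    print(ctc_greedy_decode(frames, alphabet))
```

The blank label is what lets the decoder tell a genuinely doubled character (two labels separated by a blank) from one character that spans several frames. An attention decoder, the alternative compared in the abstract, instead emits one character per decoding step and needs no blank label.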
Keywords/Search Tags: Text detection, Text recognition, Natural scene, Convolutional neural network