
Research On Text Recognition Techniques For Distorted Text

Posted on: 2024-07-10
Degree: Master
Type: Thesis
Country: China
Candidate: M J Zou
Full Text: PDF
GTID: 2568307127955449
Subject: Computer technology
Abstract/Summary:
Scene text recognition extracts text from images of natural scenes. Text in a scene image carries rich high-level semantic information, which helps a computer understand the image. Scene text recognition is widely applicable in fields such as autonomous driving, intelligent security, and video analysis. Scene text takes many forms, varying in language, font, size, color, and orientation. In addition, local regions of a scene image are often affected by background, illumination, occlusion, and noise, which greatly increases the difficulty of recognition. Although text recognition has been studied extensively and many proposed models achieve good results, recognition of distorted text in natural scenes is often poor because of text deformation, strong background interference, and the difficulty of separating characters. To address this, the thesis studies each of the three stages of scene text recognition: pre-processing, recognition, and post-processing. The specific research contents are as follows:

(1) For the pre-processing stage of distorted text recognition, the thesis proposes a text rectification method based on multi-scale feature fusion (Spatial Transformer Networks based on InceptionResnet, STN-I). During image pre-processing, the method locates the edge points of the text region, derives suitable affine transformation parameters from them, and applies the affine transformation to the text image so that the text region becomes close to a horizontal rectangle, which improves recognition. A model with multi-scale feature fusion is used for feature extraction so that the edge points of the text region are located more accurately and the rectification is better. Experiments on multiple data sets of distorted or slanted text show that STN-I gives satisfactory results both in the visual quality of the rectified text and in quantitative evaluation metrics.

(2) For the recognition stage, the thesis proposes a text recognition method based on the attention mechanism (Swin Transformer Text Recognizer). Swin Transformer, which combines the attention mechanism with a convolutional neural network, is applied to the feature extraction stage of text recognition, and both the feature extraction stage and the sequence modeling stage are improved. The improved Swin Transformer extracts image features, and a position attention mechanism performs sequence modeling, which avoids the information loss that occurs in recurrent neural networks. The text recognition method in this thesis, which is based on the Convolutional Recurrent Neural Network (CRNN), performs well on multiple data sets and outperforms current mainstream text recognition methods.

(3) For the post-processing stage, which corrects the output of text recognition, the thesis proposes a text recognition method with an explicit language model (Attentional Scene Text Recognizer with Language module, ASTER-L). In post-processing, the method uses the attention mechanism to build a language model that corrects possible misspellings in the recognition results. The language model is trained on a string data set to learn the associations between characters. During recognition, it re-predicts each character in the text sequence according to the dependencies between characters in that sequence, producing new prediction results. The proposed method obtains competitive results on multiple scene text recognition data sets.
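
The affine rectification step described in (1) can be illustrated with a small sketch. This is not the STN-I model: the network that locates the edge points is not shown, the point coordinates are made up for illustration, and the helper names `estimate_affine` and `apply_affine` are hypothetical. The sketch only shows how affine parameters could be fitted from located edge points and a target horizontal rectangle, using least squares in NumPy.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Fit a 2x3 affine matrix that maps src_pts to dst_pts (least squares)."""
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    # Homogeneous source coordinates: each row is (x, y, 1).
    src_h = np.hstack([src, np.ones((src.shape[0], 1))])
    # Solve src_h @ X = dst for X (shape 3x2), then transpose to 2x3.
    X, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    return X.T

def apply_affine(matrix, pts):
    """Apply a 2x3 affine matrix to an array of (x, y) points."""
    pts = np.asarray(pts, dtype=float)
    pts_h = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return pts_h @ matrix.T

# Assumed edge points of a slanted text region (a parallelogram), listed as
# top-left, top-right, bottom-right, bottom-left. In the thesis these would
# come from the localization model; here they are invented.
src = [(10, 20), (110, 40), (105, 70), (5, 50)]
# Target: a horizontal 100x32 rectangle.
dst = [(0, 0), (100, 0), (100, 32), (0, 32)]

M = estimate_affine(src, dst)
rectified = apply_affine(M, src)
```

Because an affine map sends parallelograms to parallelograms, the fit is exact for this input. For text with curved or perspective distortion, an affine transformation can only bring the region close to a rectangle, which matches the wording of the abstract.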
Keywords/Search Tags: Text recognition, Attention mechanism, Deep neural network, Text rectification, Language model