| With the development of information technology, more and more information is transmitted through images, and recognizing the text in these images helps in understanding their content. In recent years, scene text sequence recognition methods based on deep neural networks have improved greatly over traditional methods. This paper studies methods for text sequence recognition in natural scenes and their application, with the aim of improving the convergence speed and recognition accuracy of the learned model. The main contributions are as follows. (1) An encoder-decoder method based on stacked convolutional neural networks is proposed. The traditional encoder-decoder recognition framework uses recurrent neural networks as the encoder and decoder, which can only operate serially during both training and testing and therefore cannot make efficient use of the parallel computing capability of GPUs. With stacked convolutional neural networks as the encoder and decoder, the known label information and the independence of convolution operations can be exploited to train in parallel. At the same time, the range of semantic context the model remembers can be controlled manually by changing the number of stacked convolutional layers, which suits the task of scene text sequence recognition. In addition, the framework includes a scene text image feature extraction model based on a residual neural network, which extracts text image features automatically and is more robust when recognizing more complex text sequences in scenes. (2) Because the attention distribution in scene text sequence recognition is unimodal (single-peaked), prior knowledge is used to improve the attention computation. Most existing attention methods generate the attention weights for the corresponding feature vectors by matrix multiplication. In this paper, two methods for computing prior attention are proposed. The experimental results show that
the proposed methods accelerate model convergence and improve model performance on some datasets. (3) A parallel recognition method for scene text sequences is proposed. Most existing scene text recognition methods are serial at test time; that is, they must recognize the text character by character, starting from the first character. The improved method builds on the stacked convolutional recognition network. After the image feature vectors are extracted, a new branch is added to predict the number of characters in the image, and it is trained jointly with the backbone recognition network. Experiments show that this method enables parallel testing at a cost in accuracy (about 3%), and that testing is three times faster than with the previous method. |
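
The claim in contribution (1) that the number of stacked convolutional layers controls how much context the model remembers can be illustrated with a small sketch. It assumes causal 1-D convolutions with stride 1 and no dilation, for which a stack of L layers with kernel size k sees 1 + L(k - 1) positions. The function names, the all-ones kernel, and the impulse input are illustrative assumptions, not the architecture used in this work.

```python
def receptive_field(num_layers: int, kernel_size: int) -> int:
    """Positions visible to one output of a stack of causal 1-D
    convolutions (assumed: stride 1, no dilation)."""
    return 1 + num_layers * (kernel_size - 1)


def causal_conv1d(xs, weights):
    """Causal 1-D convolution: output[t] depends only on xs[t-k+1 .. t]."""
    k = len(weights)
    padded = [0.0] * (k - 1) + list(xs)  # left-pad so no future input is used
    return [
        sum(w * padded[t + i] for i, w in enumerate(weights))
        for t in range(len(xs))
    ]


def stacked_causal_conv(xs, weights, num_layers):
    """Apply the same causal convolution num_layers times (hypothetical stack)."""
    out = list(xs)
    for _ in range(num_layers):
        out = causal_conv1d(out, weights)
    return out


# An impulse at position 0 shows how far one input propagates:
# with 2 layers of kernel size 3 it reaches exactly 5 output positions.
impulse = [1.0] + [0.0] * 7
response = stacked_causal_conv(impulse, [1.0, 1.0, 1.0], num_layers=2)
# response == [1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 0.0, 0.0]
```

Adding a layer or widening the kernel extends the span of earlier positions that can influence each prediction, which is the sense in which the context range is adjustable by design rather than learned.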