| Text is the most important tool for conveying information in daily life. With the development of multimedia technology and computer vision, automatic text detection and recognition from images has become increasingly necessary. The technical challenge has also shifted from recognizing document text to detecting and recognizing text in natural scenes. Because natural scenes have complex backgrounds, diverse text, and unpredictable image quality, traditional methods based on hand-crafted features struggle to obtain good results. Detection and recognition methods based on deep learning bring new ideas and new challenges to this subject. The main contents of this thesis are as follows: 1. This thesis proposes a large-scale synthetic Chinese text recognition database. It first collects 1,000 background images, 50 fonts, and 17,887 corpus words, and develops a fast synthesis engine for text-line-level recognition images. The engine accounts for diversity in font, color, noise, and angle as well as character frequency balance, and generates half a million near-realistic recognition images. The evaluation metrics of the database are then explained, baseline tests are run on the dataset with classical algorithms, and the effectiveness of the dataset is analyzed. 2. This thesis proposes a multi-oriented text detection method based on prior boxes. Building on a candidate-box-based object detection algorithm, it first introduces a deformable convolution module into the feature extraction module to dynamically adjust the receptive field according to the sequential features of text lines. Second, in the detection stage, the default box sizes are designed according to aspect ratio statistics of text boxes in natural scenes, and horizontal and vertical detection branches are used. Finally, in the post-processing stage, an overlapping-box check is added to account for the structural
characteristics of Chinese, which further improves performance. 3. This thesis proposes a multi-oriented text detection method combined with segmentation. Based on a direct regression algorithm, the loss function and the post-processing are improved. A distance coefficient is added to the loss function to solve the problem of adjacent text lines sticking together in the segmentation results. In the grouping step of post-processing, grouping is decided by whether pixels are connected in the segmentation map. In the merging step, a straight line is fitted to the long edges of the text boxes in the same group, and the short edges are optimized. The final algorithm improves accuracy while preserving speed. 4. This thesis proposes a scene text recognition method based on two-dimensional LSTM. Based on the CRNN algorithm and the characteristics of Chinese text, the feature extraction part is improved. Dilated convolution is introduced to maintain the size of the feature map while enlarging the receptive field, and a two-dimensional LSTM module is added to extract the more complex structural characteristics of Chinese, yielding a more detailed feature map. Finally, experimental comparisons on the synthetic database and a natural scene database show that the proposed algorithm is more effective for Chinese text recognition. |
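
The claim in item 4 that dilated convolution can keep the feature map size while enlarging the receptive field can be checked with standard convolution arithmetic. The sketch below is only an illustration of that arithmetic: the function names `conv_output_size` and `receptive_field` are hypothetical, and the layer configurations are made-up examples, not the network described in the thesis.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length of a convolution along one spatial axis."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def receptive_field(layers):
    """Receptive field of stacked layers given as (kernel, stride, dilation)."""
    rf, jump = 1, 1  # jump = spacing of adjacent outputs, in input pixels
    for kernel, stride, dilation in layers:
        rf += dilation * (kernel - 1) * jump
        jump *= stride
    return rf


# Illustrative (assumed) configurations: three 3x3 layers with stride 1.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

# With padding equal to the dilation, a 3x3 stride-1 layer keeps the size.
same_size = conv_output_size(32, kernel=3, padding=2, dilation=2)
# A 2x2 stride-2 pooling-style layer halves it instead.
halved = conv_output_size(32, kernel=2, stride=2)

print(same_size, halved)
print(receptive_field(plain), receptive_field(dilated))
```

In this toy setting the dilated stack sees a wider input window than the plain stack at the same feature map resolution, which is the trade-off the abstract describes. Striding or pooling would also widen the window, but at the cost of spatial detail.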