| In recent years, text recognition in natural scene images has become an important research direction in computer vision. The diversity of text size, orientation, font, and language, together with variations in lighting, perspective, and background, poses great challenges to text recognition. Improving the text recognition rate in natural scenes, especially the accurate recognition of Chinese characters with complex stroke structures, is of both theoretical and practical significance. Based on an analysis of the structural relationships between Chinese characters, this paper proposes an end-to-end scene text recognition method. The main work is as follows. First, to effectively detect text regions of any size, orientation, and color in the scene, the image is fed into the end-to-end network model, where a text localization module based on ResNet-34+FPN performs text detection to obtain a series of candidate text region boxes. Furthermore, a fitting correction algorithm is proposed to refit text boxes in cases where the candidate box does not completely enclose the text. Then, a series of candidate character region boxes is extracted from the scene image using the salient visual features of characters. Next, the candidate character boxes contained in each text region are used as the nodes of a graph, and three kinds of feature information are combined as the edges of the graph to establish a graph structure: the relative distance between each pair of candidate character boxes, their average distance from the coordinate origin, and the candidate characters contained in the neighborhoods of the two boxes. Then, graph segmentation is performed on each graph structure using a spectral clustering algorithm, and a clustering evaluation function is established, according to which each clustering result is evaluated and selected. Finally, the selected optimal clustering results are combined and sent to the FCN-based recognition module to obtain the text information. The innovations of this method are as follows: (1) To address the inaccuracy of traditional character segmentation, this paper proposes a spectral clustering text segmentation method based on clustering performance evaluation, which can directly segment Chinese text lines in multi-directional natural scenes and improve the recognition rate of Chinese text in scenes; (2) Because the text localization module of the end-to-end model does not fully reflect the distortion of the text, this paper proposes an optimal text box selection method based on an objective function. The proposed method is experimentally verified on three datasets: MSRA-TD500, ICDAR2017-MLT, and ICDAR2017-RCTW-17. First, on the Chinese images of the MSRA-TD500 dataset, the end-to-end network model proposed by Michal Busta et al. in 2018 achieves a recognition rate of 86.36%, while the method of this paper achieves 88.26%. Second, on the Chinese images of ICDAR2017-MLT, the model of Busta et al. achieves 93.04%, while the method of this paper achieves 95.57%. Finally, on the Chinese images of ICDAR2017-RCTW-17, the model of Busta et al. achieves 90.73%, while the method of this paper achieves 91.02%. On average over the three datasets, the recognition rate of this paper is about 2% higher than that of Busta et al. The results therefore show that the spectral clustering method proposed in this paper can effectively segment single complete Chinese characters and improve the recognition rate of Chinese characters, and that it can be applied to vision-based applications such as robot navigation and reading assistance for the blind. |
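
The graph segmentation and cluster selection steps described in the abstract can be illustrated with a minimal sketch. It is not the implementation used in this paper and rests on several assumptions: the edge weight uses only the pairwise distance between box centers (the other two edge features are omitted), the mean silhouette score stands in for the unspecified clustering evaluation function, and all function names, the `sigma` heuristic, and the sample coordinates are hypothetical.

```python
import numpy as np

def pairwise_distances(centers):
    """Euclidean distance between every pair of candidate box centers."""
    diff = centers[:, None, :] - centers[None, :, :]
    return np.linalg.norm(diff, axis=2)

def spectral_embedding(affinity, k):
    """Row-normalized eigenvectors of the k smallest eigenvalues of the
    symmetric normalized graph Laplacian."""
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    laplacian = np.eye(len(affinity)) - inv_sqrt[:, None] * affinity * inv_sqrt[None, :]
    _, vectors = np.linalg.eigh(laplacian)  # eigenvalues come back in ascending order
    embedding = vectors[:, :k]
    norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    return embedding / np.maximum(norms, 1e-12)

def kmeans(points, k, iters=50):
    """Small k-means with deterministic farthest-point initialization."""
    idx = [0]
    for _ in range(1, k):
        gaps = np.linalg.norm(points[:, None] - points[idx][None], axis=2).min(axis=1)
        idx.append(int(np.argmax(gaps)))
    centroids = points[idx].copy()
    for _ in range(iters):
        labels = np.linalg.norm(points[:, None] - centroids[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels

def mean_silhouette(dist, labels):
    """Stand-in evaluation function (assumption): mean silhouette score."""
    clusters = set(labels.tolist())
    if len(clusters) < 2:
        return -1.0
    scores = []
    for i in range(len(labels)):
        same = labels == labels[i]
        same[i] = False
        if not same.any():
            scores.append(0.0)  # singleton cluster scores 0 by convention
            continue
        a = dist[i, same].mean()
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

def segment_characters(centers, k_candidates=(2, 3, 4), sigma=None):
    """Cluster box centers for each candidate k and keep the best-scoring result."""
    centers = np.asarray(centers, dtype=float)
    dist = pairwise_distances(centers)
    if sigma is None:
        # Heuristic (assumption): median nearest-neighbour distance as kernel width.
        nearest = np.where(np.eye(len(dist), dtype=bool), np.inf, dist).min(axis=1)
        sigma = float(np.median(nearest))
    affinity = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    best = None
    for k in k_candidates:
        if k >= len(centers):
            continue
        labels = kmeans(spectral_embedding(affinity, k), k)
        score = mean_silhouette(dist, labels)
        if best is None or score > best[0]:
            best = (score, k, labels)
    return best

# Hypothetical centers of nine candidate boxes belonging to three characters.
centers = [(10, 10), (14, 12), (12, 16),
           (40, 11), (44, 10), (42, 15),
           (70, 10), (73, 13), (71, 16)]
score, k, labels = segment_characters(centers)
print(k, labels.tolist(), round(score, 3))
```

Scoring several candidate values of `k` and keeping the best one mirrors the evaluate-then-select step in the abstract. A fuller version would also fold the remaining two edge features into the affinity matrix.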