| Writing is an important part of human civilization and one of humanity's most important communication tools. Compared with images, text carries richer and more direct semantic information. Images that mix pictures and text are everywhere in daily life, and extracting the text in an image helps greatly in understanding it. In recent years, scene text detection has received increasing attention, and many novel models and methods have emerged, driving progress in related tasks such as text recognition and scene understanding. In scene text detection, the images mostly come from daily life and have complex backgrounds. The color, shape, and size of the text vary widely, and a single scene may contain text in several languages. These factors make the task challenging. In addition, the geometric description of text regions has kept evolving, from horizontal text detection to oriented text detection, and from oriented text detection to curved text detection. This thesis studies both oriented and curved text detection. The main contributions are as follows. First, for oriented text detection, this thesis addresses the limited receptive field of the EAST model. ASPP is introduced into the deep layers of the backbone network to enlarge the network’s receptive field. A center-aware branch is then added to the two original branches of the EAST model to suppress low-quality predicted boxes generated by pixels far from the text center, which improves detection accuracy and reduces the computation required by the subsequent LNMS algorithm. Finally, the loss function is optimized and multi-scale detection is used. Experimental results show that the improved model achieves an F-measure of 83.9% on ICDAR-2015, which demonstrates its effectiveness.
Second, for curved text detection, this thesis builds on PA-Net and proposes a novel feature fusion module (Y-FPEM+AFFM) to compensate for the limited feature extraction capability of its lightweight backbone network. The Y-FPEM component enhances the features extracted by the backbone network by adding shortcut connections to the original FPEM component. The AFFM component fuses the features enhanced by Y-FPEM using an attention mechanism. Experimental results show F-measures of 84.2% on CTW-1500 and 85.0% on Total-Text, which demonstrates the improved text detection ability of the model. |
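
The effect of the center-aware branch described above can be illustrated with a minimal sketch. It assumes an FCOS-style centerness score computed from a pixel's distances to the four edges of its predicted box; the abstract does not state the exact formulation used in the thesis, so the formula, the function names `centerness` and `filter_candidates`, and the threshold value are all hypothetical and serve only to show how down-weighting off-center pixels leaves fewer candidate boxes for LNMS.

```python
import math


def centerness(left, top, right, bottom):
    """FCOS-style centerness (an assumed formulation, not taken from the thesis).

    Arguments are the distances from a pixel to the four edges of its
    predicted box. The score is 1.0 at the box center and falls toward 0.0
    as the pixel approaches an edge.
    """
    if min(left, top, right, bottom) <= 0:
        return 0.0  # pixel lies on or outside the box
    horizontal = min(left, right) / max(left, right)
    vertical = min(top, bottom) / max(top, bottom)
    return math.sqrt(horizontal * vertical)


def filter_candidates(pixels, score_threshold=0.5):
    """Keep pixels whose centerness-weighted text score passes the threshold.

    Each pixel is a tuple (text_score, left, top, right, bottom).
    The threshold is a hypothetical value chosen for illustration.
    """
    kept = []
    for text_score, left, top, right, bottom in pixels:
        weighted = text_score * centerness(left, top, right, bottom)
        if weighted >= score_threshold:
            kept.append((weighted, left, top, right, bottom))
    return kept


# Two illustrative pixels with the same raw text score.
candidates = [
    (0.9, 50, 10, 50, 10),  # at the center of its box
    (0.9, 5, 10, 95, 10),   # close to the left edge of its box
]
kept = filter_candidates(candidates)
```

In this sketch both pixels have the same raw text score, but only the centered one survives the weighting, so the off-center box never reaches the merging step.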