| As an important branch of image processing, scene text detection is widely used in information retrieval, intelligent office, smart city, and other applications. Against this background, many scholars at home and abroad have conducted in-depth research on challenging scene text detection tasks. However, several problems remain: small text is hard to detect, text angles are diverse, and arbitrary-shape text is difficult to detect accurately. Text detection in natural scenes therefore still faces great challenges. Based on an improved YOLOv5s, this paper proposes an arbitrary-direction text detection model and an arbitrary-shape text detection model to locate text in natural scene images. The main contents of this paper are as follows: (1) To address the difficulty of detecting small text and the diversity of text angles in scene images, this paper proposes an arbitrary-direction text detection model based on YOLOv5s with an attention mechanism and rotated-box localization. First, an additional set of detection heads is added to the classical YOLOv5s network structure, and five different attention modules are then embedded on this basis to explore how different attention mechanisms enhance the model's detection ability. Second, the Circular Smooth Label (CSL) is introduced to transform the angle regression problem into a classification problem, enabling the detection of arbitrary-direction text. In addition, CTST, an arbitrary-direction text dataset of traffic scenes, is constructed, and comparative experiments are conducted on the self-made CTST dataset and the public ICDAR2015 dataset. Experimental results show that, compared with the classic arbitrary-direction text detection model EAST, the F-measure of the proposed model increases by 7.2% and 5.2% on the CTST and ICDAR2015 datasets, respectively, and its detection speed is better than that of the other compared models, which verifies the feasibility and advancement of the
proposed model. Finally, perturbations of illumination intensity and resolution are applied to the CTST and ICDAR2015 test sets, respectively, to verify the robustness of the model. (2) To address the difficulty of accurately detecting arbitrary-shape text in scene images, this paper proposes an arbitrary-shape text detection model based on the improved YOLOv5s and an improved Canny algorithm. First, the improved YOLOv5s model serves as the whole-text-region detection module of the arbitrary-shape text detection model and produces a preliminary detection of the text regions in the image. Second, an adaptive local Gaussian filtering method and a regional gradient-adaptive double-threshold edge detection method are designed to improve the Canny algorithm. Then, the improved Canny algorithm and a connected-domain detection algorithm are used to detect the connected domains in the image, and the connected domains are screened according to the filtering principle. The retained connected domains are marked as text components, and the text components are merged into text lines to achieve arbitrary-shape detection. In addition, a comparative analysis is carried out on the arbitrary-shape text datasets CTW1500 and Total-text. The experimental results show that, compared with the arbitrary-shape text detection model DBNet, the F-measure of the proposed model improves by 1.3% and 1.1% on the CTW1500 and Total-text datasets, respectively, which verifies the feasibility and advancement of the proposed model. Finally, perturbations of illumination intensity and clarity are applied to the CTW1500 and Total-text test sets, respectively, to verify the robustness of the model. |
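
The Circular Smooth Label step in contribution (1) turns angle regression into classification by encoding each ground-truth angle as a periodic, smoothed target vector. The following is only a minimal sketch of that general idea, not the paper's implementation: the function names, the 180-bin discretization, the Gaussian window, and the `radius` and `sigma` values are all illustrative assumptions.

```python
import math


def circular_smooth_label(angle_deg, num_bins=180, radius=6, sigma=2.0):
    """Encode an angle in [0, 180) degrees as a smoothed, periodic class target.

    num_bins, radius and sigma are assumed values chosen for illustration.
    """
    bin_width = 180.0 / num_bins
    # Index of the bin that holds the true angle (wraps around at 180 degrees).
    center = int(round((angle_deg % 180.0) / bin_width)) % num_bins
    label = []
    for i in range(num_bins):
        # Circular distance: bin 0 and bin num_bins - 1 are neighbours.
        d = abs(i - center)
        d = min(d, num_bins - d)
        if d <= radius:
            # Gaussian window: 1.0 at the true bin, decaying with distance.
            label.append(math.exp(-(d * d) / (2.0 * sigma * sigma)))
        else:
            label.append(0.0)
    return label


def decode_angle(scores, num_bins=180):
    """Recover the angle (in degrees) of the highest-scoring bin."""
    best = max(range(num_bins), key=lambda i: scores[i])
    return best * 180.0 / num_bins


if __name__ == "__main__":
    target = circular_smooth_label(179.0)
    # Bins 178 and 0 are both one step from bin 179, so they get equal weight.
    print(target[178] == target[0], decode_angle(target))
```

Because the window wraps around the ends of the angle range, a prediction of 0 degrees for a 179-degree box is penalized only slightly, which is the boundary discontinuity that direct angle regression handles poorly.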