| Scene text detection, a significant task in computer vision, aims to localize text in a given image with bounding boxes around words or text lines. Text in an image plays a key role in acquiring and understanding scene information, and it is critical for applications such as autonomous driving, license plate and ticket recognition, intelligent robots, image retrieval, and big data industries. With the recent development of deep learning, scene text detection has made considerable progress: many text detection algorithms have been proposed and have achieved impressive performance on various benchmarks. However, given the specific characteristics of text and the complexity of backgrounds, text detection remains a challenging task. This thesis therefore proposes two text detection algorithms based on convolutional neural networks. First, to alleviate the limited receptive field and the loss of spatial information in the Feature Pyramid Network, this thesis proposes a text detection network based on hybrid feature enhancement and low-level feature refinement. The model consists of a Hybrid feature enhancement module (HFEM) and a Low-level feature refinement module (LFRM). HFEM adopts strip pooling to capture long-range information for handling long text, and convolution operations to obtain multi-scale local information. LFRM uses attention modules to suppress background information and a skip-connection structure to refine the feature maps. Finally, low-level semantic information is transferred to the high-level features to enrich their spatial information. Second, to address the multi-scale problem in text detection, this thesis proposes a text detection network based on adaptive multi-scale selection. The network consists of an Adaptive feature fusion module (AFFM) and an Adaptive atrous conv module (AACM). AFFM weights the importance of features at different levels during concatenation. AACM uses parallel atrous convolutions and global pooling to capture multi-scale information; each branch is weighted by the adaptive multi-scale selection network so that features with different receptive fields can be selected. To validate the proposed methods, extensive experiments are conducted on three benchmark datasets: ICDAR 2015, MSRA-TD500, and MLT-2017. Compared with the baseline, the methods proposed in this thesis achieve good performance in precision, recall, and F-measure, which demonstrates their effectiveness. |
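
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how precision, recall, and F-measure are commonly computed for a detector. It assumes the detections have already been matched to ground-truth boxes; the matching criterion itself depends on each benchmark's protocol and is not shown. The function name `detection_scores` and its parameters are hypothetical and do not come from the thesis.

```python
def detection_scores(true_positives, num_detections, num_ground_truth):
    """Return (precision, recall, f_measure) from matched-box counts.

    Assumption: `true_positives` is the number of detections that were
    matched to a ground-truth text instance by some benchmark-specific rule.
    """
    # Precision: fraction of predicted boxes that are correct.
    precision = true_positives / num_detections if num_detections else 0.0
    # Recall: fraction of ground-truth text instances that were found.
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    # F-measure: harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the thesis.
    p, r, f = detection_scores(true_positives=80,
                               num_detections=100,
                               num_ground_truth=120)
    print(f"precision={p:.3f} recall={r:.3f} f_measure={f:.3f}")
```

Because the F-measure is a harmonic mean, it stays low unless both precision and recall are high, which is why it is typically reported alongside the two individual scores.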