Research On Arbitrary Shape Text Detection Algorithm Based On Multi-path Fusion

Posted on: 2022-04-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Min
GTID: 2518306569494674
Subject: Computer Science and Technology
Abstract/Summary:
Text detection, which locates the bounding box of each text instance in an image, is a hot topic in computer vision research. Although traditional text detection methods based on manual feature extraction have achieved good results on images with simple backgrounds, such as bills and books, text detection in natural scene images has not been satisfactory, because the backgrounds of scene images are very complicated. Compared with general object detection, text also has its own unique properties, such as diverse shapes, irregular fonts, extreme aspect ratios, and boundaries that are difficult to determine. To address these problems, we propose a method based on multi-path fusion to realize text detection in natural scenes. We first propose a text detection algorithm based on threshold map supervision. The feature extraction module adopts a feature enhancement and fusion network based on separable convolution. Through the cascaded feature pyramid enhancement module, features of different sizes are fused more deeply and become more expressive without increasing the amount of computation. The feature fusion module then fuses features of different depths to obtain the final feature map. In the segmentation network module, the threshold map supervision network learns a separate threshold for each pixel in the image, so that text and background can be distinguished at the pixel level. Experiments on the public benchmark datasets Total-Text and CTW1500 show that the F1-Score of the text detection algorithm based on threshold map supervision reaches 84.6% and 83.8%, respectively, which shows the effectiveness of the method.
Although text detection based on semantic segmentation can detect text of arbitrary shape, it has difficulty distinguishing individual text instances, because the network is trained only to decide whether each pixel in the image is text. To obtain a more accurate representation of arbitrary shape text, we design an arbitrary shape text detection algorithm based on multi-path fusion, built on Mask R-CNN, from the perspective of instance segmentation. Character-level and word-level features are extracted from the detection branch and the Mask branch of Mask R-CNN, and a semantic segmentation branch is introduced to help extract global features. These global semantic features are used to guide the detection branch and the Mask branch. Through the multi-path feature fusion structure, character, word, and global features are fused. Experiments on the Total-Text and CTW1500 datasets show that the F1-Score of the arbitrary shape text detection algorithm based on multi-path fusion reaches 85.6% and 85.8%, respectively. This result shows that the text detection algorithm based on multi-path fusion can further improve the performance of text detection.
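The per-pixel thresholding described for the threshold map supervision algorithm can be illustrated with a minimal sketch. It assumes a differentiable-binarization-style formulation, in which a probability map P and a learned threshold map T are combined as B = 1 / (1 + exp(-k (P - T))). The abstract does not confirm this exact formula, and the function name `soft_binarize`, the steepness `k = 50`, and the toy values below are all hypothetical.

```python
import math


def soft_binarize(prob_map, thresh_map, k=50.0):
    """Approximate per-pixel binarization: sigmoid of k * (P - T).

    Assumption: a differentiable-binarization-style step. The abstract
    does not state the exact formula used in the thesis.
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Toy 2x2 maps with made-up values (not taken from the thesis).
prob = [[0.9, 0.4], [0.6, 0.1]]    # predicted text probability per pixel
thresh = [[0.3, 0.3], [0.7, 0.3]]  # threshold predicted for each pixel

soft = soft_binarize(prob, thresh)

# A pixel counts as text when its soft value exceeds 0.5,
# which is the same as its probability exceeding its own threshold.
hard = [[value > 0.5 for value in row] for row in soft]
print(hard)
```

With a single global threshold of 0.5, the pixel with probability 0.4 would be dropped and the one with probability 0.6 kept. The per-pixel thresholds in this toy example reverse both decisions, which is the kind of flexibility a learned threshold map is meant to provide.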
Keywords/Search Tags: scene text detection, feature pyramid enhancement, threshold map, instance segmentation, multi-path fusion