Research On Text Detection Methods In Natural Scenes

Posted on:2024-05-21

Degree:Master

Type:Thesis

Country:China

Candidate:Y C Su

Full Text:PDF

GTID:2568307118984229

Subject:Computer application technology

Abstract/Summary:

Natural scene text detection is an important task in computer vision. It aims to discover and locate text regions in natural scene images and to output their location and shape information. Deep learning based methods for scene text detection have made significant progress, and as research continues, the focus has shifted from horizontal or multi-directional text detection to arbitrary-shaped text detection. However, because arbitrary-shaped text varies drastically in font, size, color, and direction, detection results are still unsatisfactory.

Arbitrary-shaped text detection currently faces two main challenges. The first is designing a text instance representation that allows the model to effectively learn the geometric variations of different texts. Existing methods mainly model text instances by regressing either the mask or the contour point sequence of text regions. However, it is difficult to balance the training complexity and modeling quality of the mask, and a limited number of contour points is insufficient to capture the contour details of complex text. The second challenge is designing a concise and efficient model that accurately learns the text instance representation without post-processing, because the learning ability of existing models is unsatisfactory. To address these two challenges, this work proposes two scene text detection models with the following innovations:

(1) Because current text instance representations have difficulty fitting extremely long or curved text accurately, this work proposes a text mask representation based on the discrete cosine transform. The method uses the low-frequency components of the discrete cosine transform to represent the text mask, resulting in lower training complexity and higher representation quality. Furthermore, to address the sample imbalance of current regression methods based on the divide-and-conquer strategy, this work proposes a single-level prediction framework. A feature-aware module is designed to obtain rich contextual information and adaptively adjust the receptive field, achieving spatial and scale awareness. Additionally, a text kernel sampling strategy is introduced to adaptively adjust the number of positive samples, balancing text regression across different scales in the single-level prediction process.

(2) Because a single regression step has difficulty perceiving the entire appearance of complex text, this work proposes a text detection method based on multi-stage contour optimization. A transformer-based contour optimization module is designed to correct large-scale contour prediction errors by efficiently obtaining global information, and an accurate text contour representation is achieved by cascading multiple contour optimization modules. To address the error accumulation of current multi-stage methods, an adaptive training strategy is proposed that enhances the correction capability of the contour optimization module by increasing the potential learning paths of contour optimization. Furthermore, a re-score mechanism is proposed to evaluate the contour confidence at each stage, which suppresses false positives and improves the classification scores of texts that would otherwise be missed.

Experimental results on multiple public datasets, such as CTW1500 and ICDAR2015, demonstrate that the two proposed scene text detection models effectively address the main challenges of arbitrary-shaped text detection in natural scenes.
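The abstract does not spell out how the discrete cosine transform mask representation in (1) is computed, so the following is only a minimal sketch of the general idea: transform a binary text mask with a 2D DCT, keep a small block of low-frequency coefficients as a compact vector, and recover an approximate mask with the inverse transform. The 16 x 16 grid, the choice of the top-left `keep` x `keep` block as the "low-frequency component", the 0.5 threshold, and all function names are assumptions made for illustration, not the thesis's implementation.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (row k = frequency k)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def transpose(m):
    return [list(r) for r in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def encode_mask(mask, keep):
    """2D DCT of a square mask; return only the keep x keep low-frequency block."""
    c = dct_matrix(len(mask))
    coeffs = matmul(matmul(c, mask), transpose(c))
    return [row[:keep] for row in coeffs[:keep]]

def decode_mask(low, n, threshold=0.5):
    """Zero-pad the kept coefficients to n x n, invert the DCT, and binarize."""
    keep = len(low)
    full = [[low[u][v] if u < keep and v < keep else 0.0 for v in range(n)]
            for u in range(n)]
    c = dct_matrix(n)
    recon = matmul(matmul(transpose(c), full), c)
    return [[1 if value >= threshold else 0 for value in row] for row in recon]

def iou(a, b):
    """Intersection over union of two binary masks of equal size."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 1.0

def curved_band(n=16):
    """Toy stand-in for a curved text region: a band around a parabola."""
    return [[1 if abs(y - (4 + (x - n / 2) ** 2 / 8.0)) <= 2 else 0
             for x in range(n)] for y in range(n)]

mask = curved_band()
for keep in (1, 4, 8, 16):
    approx = decode_mask(encode_mask(mask, keep), len(mask))
    print(f"{keep * keep:3d} coefficients -> IoU {iou(approx, mask):.3f}")
```

In this sketch a regression head would predict the `keep * keep` coefficients instead of a full-resolution mask or a fixed number of contour points, which is one way to read the claimed trade-off between training complexity and representation quality.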

Keywords/Search Tags:

Natural scenes, text detection, discrete cosine transform, transformer, adaptive training strategy

Related items

1	Research And Hardware Design Of Discrete Cosine Transform
2	Text Line Based Text Detection In Natural Scenes
3	Research Of Adaptive Robust Watermarking Algorithm For Digital Images On Transform Domain
4	Adaptive Speech Enhancement Based On Discrete Cosine Transform In High Noise Environment
5	Design And Implementation Of Text Detection And Recognition System For Natural Scenes
6	Research On Recursive IMDCT Algorithms And Design Of An Audio DSP Core
7	Algorithm Research On Text Information Hiding Based On Transform Domain
8	Design And Implementation Of Discrete Cosine Transformer Based On Approximate Calculation
9	Research On Natural Scene Text Detection And Recognition Technology Based On Deep Learning
10	Research And Application Of Arbitrary Shape Text Detection Algorithm In Natural Scenes