Learning-Based Text Extraction In Natural Background

Posted on:2008-08-08

Degree:Master

Type:Thesis

Country:China

Candidate:R J Jiang

Full Text:PDF

GTID:2178360212976072

Subject:Computer application technology

Abstract/Summary:

Recognizing text in natural scenes benefits many important applications. Because backgrounds are complex and character appearance varies widely, however, practical use is held back by immature text localization and segmentation techniques. This thesis presents a novel machine-learning-based algorithm for extracting text from natural scenes. First, the algorithm decomposes the input image into multiple connected components (CCs) using NLNiblack; these include both text CCs and non-text CCs. To localize and segment text from the background, the goal is to preserve text CCs while discarding non-text CCs. Seventeen text features are proposed to discriminate text from non-text. All CCs are then verified by a 2-stage classification model composed of a cascade classifier and an SVM. The cascade consists of 17 weak classifiers, each concentrating on one feature. The first weak classifier receives all CCs. If a weak classifier judges a CC to be non-text, the CC is rejected immediately; otherwise it is passed to the next weak classifier, and so on until the end of the cascade. The cascade filters out most non-text CCs, and the SVM performs further verification to obtain a more precise result. The final output is a binary image containing text only. Combining weak classifiers with a strong classifier makes the algorithm both efficient and effective. The thesis also proposes a pixel-wise criterion for evaluating the algorithm on a test set, and the test results show satisfactory performance.
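The 2-stage verification described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The feature names (aspect_ratio, fill_ratio), the threshold ranges, the interval form of the weak classifiers, and the simple function standing in for the trained SVM are all assumptions made for illustration. The thesis uses 17 features and 17 weak classifiers, whereas the sketch uses two.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# A connected component (CC) is represented only by its feature values here.
Features = Dict[str, float]


@dataclass
class WeakClassifier:
    """Hypothetical weak classifier: tests one feature against an interval."""
    feature: str
    low: float
    high: float

    def accepts(self, features: Features) -> bool:
        return self.low <= features[self.feature] <= self.high


def two_stage_filter(
    components: Sequence[Features],
    cascade: Sequence[WeakClassifier],
    final_classifier: Callable[[Features], bool],
) -> List[Features]:
    """Keep the CCs that pass every cascade stage and the final classifier."""
    kept = []
    for features in components:
        # Stage 1: all() stops at the first weak classifier that rejects,
        # so a non-text CC is discarded without evaluating later stages.
        if not all(weak.accepts(features) for weak in cascade):
            continue
        # Stage 2: only cascade survivors reach the costlier classifier
        # (an SVM in the thesis; any callable returning a bool works here).
        if final_classifier(features):
            kept.append(features)
    return kept


# Illustrative data; every value below is made up for the example.
CASCADE = [
    WeakClassifier("aspect_ratio", 0.1, 10.0),
    WeakClassifier("fill_ratio", 0.1, 0.9),
]
COMPONENTS = [
    {"aspect_ratio": 0.8, "fill_ratio": 0.5},   # passes both stages
    {"aspect_ratio": 12.0, "fill_ratio": 0.5},  # rejected by the 1st weak classifier
    {"aspect_ratio": 1.0, "fill_ratio": 0.98},  # rejected by the 2nd weak classifier
    {"aspect_ratio": 1.0, "fill_ratio": 0.15},  # passes cascade, rejected in stage 2
]


def stand_in_svm(features: Features) -> bool:
    """Placeholder for a trained SVM decision function (assumption)."""
    return features["fill_ratio"] >= 0.2


KEPT = two_stage_filter(COMPONENTS, CASCADE, stand_in_svm)
```

With this sample data, only the first component survives both stages. Putting the cheap single-feature tests first means the costlier second-stage classifier runs only on the few CCs that the cascade cannot reject, which is the efficiency argument made in the abstract.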

Keywords/Search Tags:

Text Extraction, Text Localization, Text Segmentation, Text Features, 2-Stage Classification, Cascade Classifier

Related items

1	Research On Video Text Information Extraction Based On Features Integration
2	Research On The Technology Of Video Text Information Extraction
3	Research On Text Extraction Technology In Video
4	Research On The Location And Segmentation Of Unconstrained Text In Images
5	Research On Keyword Extraction Technology Oriented To Conversational Text
6	Chinese Text Automatic Classification System - Of Chinese Words And Classifier Design
7	Techniques For Text Extraction In Videos
8	Sentence Extraction And Reduction For Indonesian Text Summarization
9	Research On Embed Text Extraction From Still Images
10	Research On Network Text Classification Technique