
Research On Text Location Based On Adaboost In Natural Images

Posted on: 2017-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Zheng
Full Text: PDF
GTID: 2348330482484840
Subject: Computer Science and Technology
Abstract/Summary:
With the development of multimedia and network technology, large numbers of scene images have entered people's study, life, and work. Text in a scene is an important kind of semantic information and plays an important role in understanding, analyzing, and retrieving the scene. Because the font, color, size, and background of text in natural scenes vary widely, locating that text is a difficult and challenging research topic. Building on a survey of text location methods from the past 10 years, this thesis studies text location in natural scenes in depth and proposes a novel text localization method for natural images based on Adaboost. The algorithm consists of four stages: preprocessing, candidate region generation, feature extraction, and classification of the candidate text regions. In the preprocessing stage, three grayscale conversion methods are compared: the maximum value method, the mean value method, and the weighted average method. After analyzing the strengths and weaknesses shown in the experimental results, the thesis chooses the weighted average method to process scene images. It also proposes an edge detection algorithm based on an improved Sobel operator. Experimental results show that this algorithm extracts image edges effectively, addresses the problem of missed edges, and has a degree of noise resistance. To generate candidate text regions, the thesis proposes using text size and edge density to characterize text characters.
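The preprocessing and candidate-generation ideas described so far can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not describe the improved Sobel operator, so the classic 3x3 Sobel kernels stand in for it; the weighted-average coefficients are the common BT.601 luminance weights rather than values from the thesis; and the function names, thresholds, and the tiny test image are all hypothetical.

```python
# Illustrative sketch: standard grayscale conversions and the classic Sobel
# operator. This is NOT the thesis's improved Sobel variant.

def to_gray(pixel, method="weighted"):
    """Convert one (R, G, B) pixel to a gray level with the chosen method."""
    r, g, b = pixel
    if method == "max":
        return max(r, g, b)
    if method == "mean":
        return (r + g + b) / 3
    if method == "weighted":
        # Common BT.601 luminance weights; assumed, not taken from the thesis.
        return 0.299 * r + 0.587 * g + 0.114 * b
    raise ValueError(f"unknown method: {method}")

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_magnitude(gray):
    """Gradient magnitude for interior pixels of a 2-D list; borders stay 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * gray[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def edge_density(mag, box, threshold=100.0):
    """Fraction of pixels in box = (top, left, bottom, right) that are edges."""
    top, left, bottom, right = box
    area = (bottom - top) * (right - left)
    edges = sum(1 for y in range(top, bottom) for x in range(left, right)
                if mag[y][x] > threshold)
    return edges / area

def is_candidate(mag, box, min_side=2, max_side=300, min_density=0.2):
    """Keep a region only if its size and edge density look text-like.

    All thresholds here are hypothetical placeholders.
    """
    top, left, bottom, right = box
    height, width = bottom - top, right - left
    if not (min_side <= height <= max_side and min_side <= width <= max_side):
        return False
    return edge_density(mag, box) >= min_density

# Tiny 5x5 image: two black columns followed by three white columns.
rgb = [[(0, 0, 0)] * 2 + [(255, 255, 255)] * 3 for _ in range(5)]
gray = [[to_gray(p) for p in row] for row in rgb]
mag = sobel_magnitude(gray)
print(is_candidate(mag, (1, 1, 4, 4)))  # region straddling the vertical edge
print(is_candidate(mag, (1, 3, 4, 5)))  # flat white region
```

In a real pipeline the boxes would come from connected-component analysis of the edge map, and the size and density thresholds would be tuned on training images.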
Experimental results show that filtering connected regions by these two features eliminates regions that obviously do not belong to text, leaving the candidate text regions. Four kinds of scene text features are then extracted: Gabor features, stroke density, texture, and the variance and expectation of image derivatives. Experimental results show that classifiers built on each of the four features all contribute to classifying text regions. The thesis improves the classical Adaboost algorithm for text location in natural images and uses CART (Classification And Regression Tree) to construct the Adaboost strong classifier. Four kinds of weak text classifiers are generated from the four types of text features, and Adaboost with CART combines them into a strong classifier capable of classifying text regions. The strong classifier is then used to screen the candidate text regions and obtain the correct text regions. Following a study of the ICDAR2003 English image database, a database of 300 Chinese scene images is built, of which 200 images serve as training samples and 100 as test samples. The precision of text location is 82.8% and the recall is 85.8%. The experimental results show that the method works well on natural images containing text of various fonts, sizes, and colors, and achieves both high recall and high precision.
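To make the classification stage more concrete, here is a minimal sketch of boosting tree-based weak classifiers over four features. It rests on several assumptions: the abstract does not specify how the classical algorithm was improved or how the CART trees were configured, so the sketch uses classic discrete AdaBoost with depth-1 trees (decision stumps) as a stand-in for CART. The feature values are invented, and the toy data is deliberately separable so the result is easy to check.

```python
import math

def fit_stump(X, y, w):
    """Find the one-feature threshold split with the lowest weighted error.

    Returns (error, feature_index, threshold, sign), or None if no split exists.
    """
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            for sign in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (sign if row[f] > t else -sign) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, sign)
    return best

def stump_predict(stump, row):
    """Predict +1 or -1 for one sample with a fitted stump."""
    _, f, t, sign = stump
    return sign if row[f] > t else -sign

def adaboost_fit(X, y, rounds=10):
    """Classic discrete AdaBoost; labels must be +1 (text) or -1 (non-text)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        if stump is None or stump[0] >= 0.5:
            break  # no usable weak learner remains
        err = max(stump[0], 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        if stump[0] == 0:
            break  # a perfect weak learner leaves nothing to reweight
        # Raise the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    """Weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Hypothetical feature vectors per candidate region:
# [gabor, stroke_density, texture, derivative_variance]
X = [[0.9, 0.8, 0.7, 0.6], [0.8, 0.7, 0.9, 0.5], [0.7, 0.9, 0.6, 0.7],
     [0.2, 0.3, 0.8, 0.4], [0.1, 0.2, 0.5, 0.6], [0.3, 0.1, 0.7, 0.5]]
y = [1, 1, 1, -1, -1, -1]

model = adaboost_fit(X, y)
print([adaboost_predict(model, row) for row in X])
```

On realistic data no single stump is perfect, so the loop runs for several rounds and the reweighting step focuses later stumps on the regions that earlier ones misclassified. Deeper CART trees could replace the stumps without changing the boosting loop.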
Keywords/Search Tags: text location, text recognition, connected domain, classification and regression tree