The Research Of Text And Non-Text Image Classification In The Wild

Posted on:2017-12-28

Degree:Master

Type:Thesis

Country:China

Candidate:C Q Zhang

GTID:2348330503472354

Subject:Electronics and Communications Engineering

Abstract/Summary:

With the explosion of the information era, ever more image and video data is being spread in a variety of ways. How to obtain helpful information efficiently from such a large volume of data has become a new challenge. Text in natural images is an important source of information that is helpful for many real-world applications, including image retrieval, human-computer interaction, and navigation systems. Scene text reading systems, which consist of text detection and recognition, have been extensively studied in recent years. However, only a small portion of the images in a large volume of data contain text. Therefore, an efficient preprocessing algorithm that quickly determines whether an image contains text is desirable. The main contributions of this thesis are as follows:

To address the lack of an image database for text image discrimination, we have collected a large dataset from the Internet and adopted relevant evaluation metrics. Most of the images are natural images, and a few are born-digital images and document images. Because of the diversity of text in this dataset, including layout, font, color and language type, it can serve as a standard text/non-text image classification benchmark for evaluating different algorithms.

Starting from feature encoding, we propose an effective and efficient method named CNN Coding that combines the advantages of three widely used techniques in this field: MSER, CNN and BoW. On the collected dataset, CNN Coding outperforms other typical methods. In addition, the comparison between our method and several typical image classification algorithms helps us further explore the difficulties and needs of this problem.

Considering the demand for high usability, feasibility and low time consumption, we also propose a novel convolutional neural network variant called Multi-scale Spatial Partition Network (MSP-Net) for text/non-text image classification in the wild. Through multi-scale spatial partition, the network transforms image-level classification into block-level classification. As long as one of the image blocks is classified as a text block, the whole image is regarded as a text image. Furthermore, MSP-Net predicts the results of all blocks in a single forward propagation, which is very efficient. On several datasets, MSP-Net outperforms other methods, including CNN Coding, in both classification results and speed.

This thesis deals with several current problems of text/non-text image classification, such as the benchmark and the metrics. After an analysis of the essentials and difficulties, two efficient and effective algorithms are proposed, which can serve as vital tools for mining textual information from a large volume of image or video data.
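The block-to-image decision rule stated in the abstract can be illustrated with a minimal Python sketch. This is not the MSP-Net implementation, which performs the partition inside the network and scores all blocks in one forward pass. The grid sizes (1, 2, 4), the 0.5 threshold, the box layout, and the names `partition_blocks` and `is_text_image` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout: (top, left, bottom, right) in pixel or feature-map units.
Box = Tuple[int, int, int, int]


def partition_blocks(height: int, width: int,
                     grids: Sequence[int] = (1, 2, 4)) -> List[Box]:
    """Split an image (or feature map) into g x g blocks for each grid size g.

    The grid sizes are illustrative assumptions, not values from the thesis.
    """
    boxes: List[Box] = []
    for g in grids:
        for row in range(g):
            for col in range(g):
                top, bottom = row * height // g, (row + 1) * height // g
                left, right = col * width // g, (col + 1) * width // g
                boxes.append((top, left, bottom, right))
    return boxes


def is_text_image(block_scores: Sequence[float], threshold: float = 0.5) -> bool:
    """Image-level label: text if at least one block is classified as text.

    block_scores holds one text probability per block; the threshold is assumed.
    """
    return any(score >= threshold for score in block_scores)
```

For example, with one score per block returned by `partition_blocks`, an image whose blocks all score below the threshold is labeled non-text, while a single high-scoring block is enough to label the whole image as text.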

Keywords/Search Tags:

Natural image, Text/non-text image classification, MSER, Convolutional neural network, Multi-scale spatial partition

Related items

1	Scene Text Localization And Recognition Algorithm Research Based On Convolutional Neural Network
2	The Research On Text Identification And Detection Algorithm Of Natural Scene Images
3	Text Localization In Natural Scene Based On MSER And Convolutional Neural Network
4	Study On Text Detection In Natural Scene Images
5	Natural Scene Direction Text Detector Based On Convolutional Neural Network
6	Research On Text Detection And Recognition In Natural Images
7	Research On Multi-features Natural Scene Text Detection Based On Image Enhancement
8	Research On Multi-orientation Text Detection Algorithm In Natural Scene Based On MSER
9	Multi-channel Text Matching Approaches Based On Deep Neural Network
10	The Research Based On Convolutional Neural Network For Text Detection In Natural Scene Images