
Research And Application Of Scene Text Recognition And Information Extraction

Posted on: 2021-04-17    Degree: Master    Type: Thesis
Country: China    Candidate: H J Yang    Full Text: PDF
GTID: 2428330611962509    Subject: Engineering
Abstract/Summary:
Text is an important medium of information for people, and using computers to recognize text in images has long been a focus of pattern recognition research. With the development of electronic and information technology, people can use digital devices such as mobile phones to capture scene images in various environments, and these images typically contain a large amount of non-text texture and noise. How to recognize the text in scene images and extract the information of interest for analysis and transmission has therefore become a difficult and critical problem for researchers. Thanks to the progress of deep learning in fields such as object detection, optical character recognition (OCR) technology has also advanced greatly in both methods and performance in recent years. However, traditional OCR does not understand the semantic content of text: its output is usually a string of editable but noisy text, from which it is difficult to extract the content of interest, and this greatly limits its use in practical applications. In response to this problem, the research in this thesis covers the following three aspects.

First, in view of the complexity of natural scene text images, a text line image generation algorithm based on an image rendering pipeline is designed to efficiently generate a large number of labeled text line images for training the subsequent text recognition model. For the sequence labeling problem, a rule-based data generation algorithm is proposed to directly generate labeled text sequences for training the subsequent sequence labeling model.

Second, to address scene text recognition, ResNet (residual neural network) is adopted as the backbone of the scene text recognition model and compared against a VGGNet (very deep convolutional network) backbone. To better adapt to Chinese scene text, a recognition model incorporating a spatial transformation module based on thin-plate spline (TPS) interpolation is proposed, achieving 98.13% accuracy on the test data.

Third, a BiLSTM-CRF model based on recurrent neural networks is proposed for information extraction. The OCR recognition result sequence is modeled with a bidirectional long short-term memory (BiLSTM) network to obtain a feature sequence containing contextual information; a conditional random field (CRF) then establishes the relationship between features and labels for label prediction, and the specific text of interest is obtained from the predicted labels. Experimental results show that the method achieves 88.52% accuracy on the YNIDREAL scene image dataset (scene ID card images) provided by the Yunnan Puer Electricity Board. Compared with a plain conditional random field model, accuracy is increased by 16.39%, demonstrating the feasibility and robustness of the method.
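To illustrate the second aspect, the following is a minimal sketch of a CRNN-style scene text recognizer: a small ResNet-style convolutional backbone, a BiLSTM sequence encoder, and a per-timestep classification head. The abstract does not specify the decoder or layer sizes, so the CTC-style output, the omission of the TPS rectification module, and all dimensions below are illustrative assumptions, not the thesis's actual architecture.

```python
# Sketch only: CTC decoding and TPS rectification are assumptions, not
# details given in the abstract; layer sizes are placeholders.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)              # identity shortcut


class CRNN(nn.Module):
    def __init__(self, num_classes, in_ch=1):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, 2),                 # H/2, W/2
            ResidualBlock(64),
            nn.MaxPool2d(2, 2),                 # H/4, W/4
            ResidualBlock(64),
            nn.AdaptiveAvgPool2d((1, None)),    # collapse height to 1
        )
        self.encoder = nn.LSTM(64, 128, bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(256, num_classes)  # includes CTC blank

    def forward(self, images):                  # images: (B, 1, 32, W)
        feats = self.backbone(images)           # (B, 64, 1, W/4)
        feats = feats.squeeze(2).permute(0, 2, 1)      # (B, W/4, 64)
        feats, _ = self.encoder(feats)
        return self.classifier(feats)           # per-timestep class scores


if __name__ == "__main__":
    model = CRNN(num_classes=3756)              # e.g. charset size + blank
    logits = model(torch.randn(2, 1, 32, 128))  # two grayscale text lines
    log_probs = logits.log_softmax(-1).permute(1, 0, 2)  # (T, B, C) for CTC
    print(log_probs.shape)                      # torch.Size([32, 2, 3756])
```

For the third aspect, the sketch below shows a BiLSTM-CRF tagger over OCR output characters, in the spirit of the described extraction model. It assumes the third-party `pytorch-crf` package for the CRF layer and BIO-style tags; the vocabulary and tag sizes are placeholders, and none of these implementation choices come from the thesis itself.

```python
# Sketch only: assumes `pip install pytorch-crf`; sizes are placeholders.
import torch
import torch.nn as nn
from torchcrf import CRF


class BiLSTMCRF(nn.Module):
    """Embeds an OCR character sequence, encodes it with a BiLSTM,
    and scores tag sequences with a linear-chain CRF."""

    def __init__(self, vocab_size, num_tags, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # The bidirectional LSTM gives each position left and right context.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim // 2,
                              bidirectional=True, batch_first=True)
        self.hidden2tag = nn.Linear(hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def _emissions(self, char_ids):
        feats, _ = self.bilstm(self.embedding(char_ids))
        return self.hidden2tag(feats)           # (batch, seq_len, num_tags)

    def loss(self, char_ids, tags, mask):
        # The CRF returns log-likelihood; negate it for a training loss.
        return -self.crf(self._emissions(char_ids), tags, mask=mask,
                         reduction='mean')

    def decode(self, char_ids, mask):
        # Viterbi decoding: the most likely tag sequence per sequence.
        return self.crf.decode(self._emissions(char_ids), mask=mask)


if __name__ == "__main__":
    model = BiLSTMCRF(vocab_size=5000, num_tags=9)
    chars = torch.randint(1, 5000, (2, 20))     # two OCR lines of 20 chars
    tags = torch.randint(0, 9, (2, 20))
    mask = torch.ones(2, 20, dtype=torch.bool)
    print(model.loss(chars, tags, mask).item())
    print(model.decode(chars, mask)[0])
```

The tags predicted by the CRF would then be mapped back to character spans to recover the fields of interest from the OCR output.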
Keywords/Search Tags: Scene Text Recognition, Information Extraction, Data Generation, Bidirectional Long Short-Term Memory Network, Conditional Random Field