A Method For Detecting Merged Subscripts In English Scientific Document

Posted on:2008-12-16

Degree:Master

Type:Thesis

Country:China

Candidate:G W Zhang

Full Text:PDF

GTID:2178360218955285

Subject:Computational Mathematics

Abstract/Summary:

Computer-aided document-handling systems are widely used, and there is growing interest in recognizing and retrieving document images. Optical character recognition (OCR) was developed for this purpose. The recognition rate for printed document images is now high, so the segmentation of touching characters has become an important factor in improving it further. Scientific documents contain many mathematical expressions, which consist of special characters with complicated structure and whose symbols stand in two-dimensional positional relationships to one another. Recognizing merged characters is therefore an important part of recognizing these expressions. To this end, this thesis presents a method for detecting merged subscripts. The thesis is organized as follows.

Chapter 1 provides a brief review of neural networks and illustrates the workflow of a mathematical expression recognition system, which comprises expression extraction, recognition and reconstruction. Existing methods for detecting and segmenting merged characters are also reviewed.

Chapter 2 analyzes some features of merged subscripts and presents a new projection method, on which a detection method is built. First, the fringe projection of the character image is computed. Second, merged subscripts are detected from the characteristic projection information. Finally, a simulation shows that the detection method works effectively.

Chapter 3 discusses a fuzzy neural network method for selecting the parameters of the detection method. This method can be used not only to choose proper values for these parameters by treating them as network weights, but also to detect merged subscripts independently.

Chapter 4 considers how often merged subscripts appear in scientific documents and shows simulation results for some real document images. The simulation is based on mathematical expression extraction and marks up the merged subscripts in the source images.
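The abstract does not define the fringe projection or the detection rule, so the following is only an illustrative sketch under stated assumptions, not the thesis's method. It assumes that a "fringe" profile records, for each column of a binarized glyph image, the row of the first foreground pixel from the top, and that a merged subscript shows up as a large downward step in that profile. The function names and the `drop_ratio` threshold are hypothetical; in the thesis, parameters of this kind are said to be chosen by a fuzzy neural network (Chapter 3).

```python
import numpy as np

def top_fringe(img):
    """Per-column row index of the first foreground pixel from the top.

    Assumption: img is a 2-D array where nonzero means ink.
    Columns without ink get the image height as a sentinel value.
    """
    img = np.asarray(img, dtype=bool)
    height = img.shape[0]
    has_ink = img.any(axis=0)
    first_ink_row = img.argmax(axis=0)  # 0 for empty columns, fixed below
    return np.where(has_ink, first_ink_row, height)

def looks_like_merged_subscript(img, drop_ratio=0.3):
    """Hypothetical rule: flag the glyph if some column split leaves a right
    part whose highest ink sits at least drop_ratio * height below the
    highest ink of the left part. Returns (flag, split_column)."""
    img = np.asarray(img, dtype=bool)
    height, width = img.shape
    profile = top_fringe(img)
    for split in range(1, width):
        left_top = profile[:split].min()
        right_top = profile[split:].min()
        if right_top - left_top >= drop_ratio * height:
            return True, split
    return False, None

# Toy glyph: a tall block (base character) touching a lower block (subscript).
glyph = np.zeros((10, 8), dtype=bool)
glyph[0:7, 0:4] = True   # base character, rows 0-6
glyph[5:10, 4:8] = True  # subscript, rows 5-9, touching the base
print(top_fringe(glyph).tolist())
print(looks_like_merged_subscript(glyph))
```

A rule this simple would also flag ordinary shapes with a low right side, such as "L", which is one reason a practical detector needs additional projection features and tuned parameters.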

Keywords/Search Tags:

Merged Subscript, Fringe Projection, Fuzzy Neural Networks, Formula Extraction, Merged Characters Segmentation

Related items

1	Extraction, Recognition And Reconstruction Of Mathematics Formulas In English Scientific Document
2	Research On Key Issues Of Printed Mathematical Formula Recognition
3	Web Authentication Code Generation And Recognition
4	Research And Implementation On Merged Content Distribution Management System Toward 4K Streaming Media
5	Latch-up prevention and modeling of merged bipolar-MOS structures for BiCMOS applications
6	Structure analysis and modeling for a merged BIPMOS device
7	Research On The Method Of Change Domain Based On Merged Model Of Petri Net Configuration
8	Research And Application Of Deep Learning Methods In Key Technologies Of Phase Extraction In ESPI And FPP
9	Research On Path-oriented Merged Strategy Of Hybrid Detection
10	A Study Of Algorithms Of Color Image Compression Using Wavelet Transform