| Computer-aided document-handling systems are now widely used, and there is growing interest in recognizing and retrieving document images. Optical character recognition (OCR) was developed for this task. The recognition rate for printed document images is already high, so the segmentation of touching characters has become an important factor in improving it further. Scientific documents contain many mathematical expressions, which consist of special characters with complicated structure arranged in two-dimensional positional relationships. Recognizing merged characters is therefore an important part of recognizing these expressions. To this end, this thesis presents a method for detecting merged subscripts. The thesis is organized as follows. Chapter 1 provides a brief review of neural networks and illustrates the workflow of a mathematical expression recognition system, which comprises the extraction, recognition, and reconstruction of expressions. Methods for detecting and segmenting merged characters are also reviewed. Chapter 2 analyzes features of merged subscripts and presents a new projection method, on which a detection method is based. First, the fringe projection of the character image is computed. Second, merged subscripts are detected from the characteristic projection information. Finally, a simulation shows that the detection method works effectively. Chapter 3 discusses a fuzzy neural network method for selecting the parameters of the detection method. This method can be used not only to choose proper values of these parameters by setting them as network weights, but also to detect merged subscripts independently. Chapter 4 considers the probability that merged subscripts appear in scientific documents and shows simulation results for some real document images. 
The simulation builds on mathematical expression extraction and marks up the merged subscripts in the source images. |
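
The projection-based detection summarized for Chapter 2 can be illustrated with a small sketch. The abstract does not define the fringe projection, so the code below uses an ordinary column projection profile, a standard technique for touching characters, as a stand-in; it is not the thesis's method. The function names, the `margin` parameter, and the toy glyph are all hypothetical and chosen only for illustration.

```python
def column_projection(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def candidate_cut(profile, margin=1):
    """Return the interior column with the least ink, or None if too narrow.

    `margin` is a hypothetical parameter: it excludes columns near the
    edges, where low counts are expected even for a single character.
    """
    interior = range(margin, len(profile) - margin)
    if len(interior) == 0:
        return None
    # min() returns the leftmost column when several share the minimum.
    return min(interior, key=lambda x: profile[x])


# Toy 5x7 glyph (assumed data): a tall stroke joined to a small, low
# "subscript" by a thin bridge along the bottom row.
glyph = [
    [1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
]

profile = column_projection(glyph)
cut = candidate_cut(profile)
```

In this toy case the profile drops sharply at the thin bridge, so the bridge columns are the natural place to look for a merge between the base character and its subscript. A real detector would need thresholds on the profile, which is presumably the role of the parameters that Chapter 3 tunes.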