
The Study on Utilizing Region-Labeling Automata in Document Image Analysis

Posted on: 2006-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: D Q Wang
Full Text: PDF
GTID: 2168360152492902
Subject: Systems analysis and integration
Abstract/Summary:
Document image analysis is a key part of an Optical Character Recognition (OCR) system. Building on the author's detailed study of the vertex chain code (VCC), this thesis designs and implements a new method for document image skew rectification, layout analysis, and extraction of the geometric features of connected regions. A region-labeling automaton is a technique for labeling object regions in a digital image and generating the corresponding vertex chain codes. This thesis applies the technique to document image analysis for the first time. Its contributions include using the vertex chain code to compute each region's Minimum Enclosing Rectangle (MER), extract text lines from a document image, detect image skew, and analyze the page layout. Multiple experiments confirm the feasibility and efficiency of the algorithm.
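To make the link between a vertex chain code and a region's enclosing rectangle concrete, here is a minimal Python sketch. It is not the thesis's implementation. It assumes the standard VCC definition, in which each code (1, 2 or 3) is the number of region cells touching a boundary vertex. It also assumes a clockwise traversal in image coordinates (y grows downward), starting at a hypothetical vertex (0, 0) and heading east. The function names `vcc_to_vertices` and `enclosing_rectangle` are illustrative, and the rectangle computed is the axis-aligned one, which may differ from the MER definition used in the thesis.

```python
from typing import List, Tuple


def vcc_to_vertices(vcc: str,
                    start: Tuple[int, int] = (0, 0),
                    heading: Tuple[int, int] = (1, 0)) -> List[Tuple[int, int]]:
    """Decode a vertex chain code into the boundary vertices it visits.

    Assumed convention: step one unit along the current heading, then turn
    according to the code of the vertex just reached:
      1 -> convex corner, turn right (clockwise, y pointing down)
      2 -> straight edge, keep heading
      3 -> concave corner, turn left
    """
    x, y = start
    dx, dy = heading
    vertices = [(x, y)]
    for code in vcc:
        x, y = x + dx, y + dy
        vertices.append((x, y))
        if code == "1":
            dx, dy = -dy, dx      # right turn
        elif code == "3":
            dx, dy = dy, -dx      # left turn
        elif code != "2":
            raise ValueError(f"invalid vertex chain code element: {code!r}")
    return vertices


def enclosing_rectangle(vcc: str) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, width, height) of the axis-aligned rectangle
    enclosing the region whose boundary is described by `vcc`."""
    vertices = vcc_to_vertices(vcc)
    if vertices[0] != vertices[-1]:
        raise ValueError("chain code does not describe a closed boundary")
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


if __name__ == "__main__":
    # Single pixel, a 2x1 horizontal bar, and a three-pixel L shape.
    for code in ("1111", "211211", "21131121"):
        print(code, enclosing_rectangle(code))
```

Because the rectangle falls out of a single pass over the chain code, geometric features of this kind can in principle be gathered while the boundary is traced, without revisiting the pixels of the region.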
Keywords/Search Tags: OCR, vertex chain code (VCC), region-labeling automata, skew rectification, robust regression, layout analysis, layout recognition, layout understanding, run-length smoothing, connected region, minimum enclosing rectangle (MER), compactness, posture ratio