A computational framework for adaptive reading in document image understanding

Posted on: 1995-08-20

Degree: Ph.D.

Type: Dissertation

University: State University of New York at Buffalo

Candidate: Lam, Stephen Wai-Keung

GTID: 1478390014989904

Subject: Computer Science

Abstract/Summary:

The task that a reading machine performs is generally known as document image understanding (DIU). DIU extracts relevant information from the digital image of a printed document and converts it into editable symbolic form: it locates regions of interest on the document and derives a logical interpretation of the document's layout and content. At present, most reading machines in use are custom designed to process specific types of documents, such as bank checks, tax forms, and postal mailpieces; they are limited to their assigned tasks and cannot easily be adapted to other documents. Human reading, by contrast, is an adaptive process that switches easily between different kinds of documents, because it is guided by the reader's knowledge and intention in reading. Inspired by the processes of human reading and by the current state of the art in DIU, this dissertation proposes a computational framework for adaptive reading in DIU. The framework is able to (i) process many different types of documents, (ii) classify documents automatically, and (iii) use knowledge about documents to guide document image processing activities.

The framework consists of three major components: (i) a knowledge base containing both general and specific document knowledge, (ii) a set of image processing tools specialized for document image analysis, and (iii) a control mechanism that uses knowledge to direct the tools in both object location and recognition. Under this architecture, adaptive DIU becomes a constraint satisfaction problem: image processing tools extract data from raster images to satisfy constraints defined in the knowledge base. The framework has neither a predefined document image processing strategy nor a fixed level of content interpretation; both are determined by the knowledge about the documents of interest, i.e., the domain knowledge.

To validate the framework's capability, a system was implemented following the framework guidelines. A test set covering four printed document domains (postal mailpieces, forms, bills, and journals) was used to evaluate it, and the experimental results demonstrate the adaptability of the system.
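
The three-component architecture described in the abstract (knowledge base, tools, control mechanism) can be illustrated with a minimal sketch. This is not the dissertation's implementation: every name here (`Block`, `Constraint`, `KNOWLEDGE_BASE`, `find_block`, `read_document`) and every constraint is hypothetical. The "image" is a toy list of already-located text blocks rather than a raster image, and taking the first domain whose constraints all hold is an assumed simplification of constraint satisfaction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Block:
    """A text block with a normalized position (toy stand-in for raster data)."""
    text: str
    x: float  # horizontal centre, 0.0 (left) to 1.0 (right)
    y: float  # vertical centre, 0.0 (top) to 1.0 (bottom)


@dataclass(frozen=True)
class Constraint:
    """A named field and the test a block must pass to fill it (hypothetical)."""
    field: str
    test: Callable[[Block], bool]


# Knowledge base: per-domain constraints. The domains come from the abstract;
# the constraints themselves are invented for illustration.
KNOWLEDGE_BASE: Dict[str, List[Constraint]] = {
    "mailpiece": [
        Constraint("zip_code", lambda b: b.y > 0.5 and b.text[-5:].isdigit()),
        Constraint("stamp", lambda b: b.x > 0.7 and b.y < 0.3),
    ],
    "journal": [
        Constraint("title", lambda b: b.y < 0.2 and 0.3 < b.x < 0.7),
        Constraint("abstract", lambda b: b.text.lower().startswith("abstract")),
    ],
}


def find_block(blocks: List[Block], test: Callable[[Block], bool]) -> Optional[Block]:
    """Tool: return the first block that passes the test, or None."""
    return next((b for b in blocks if test(b)), None)


def read_document(blocks: List[Block]) -> Tuple[Optional[str], Dict[str, str]]:
    """Control: pick the first domain whose constraints can all be satisfied."""
    for domain, constraints in KNOWLEDGE_BASE.items():
        found: Dict[str, str] = {}
        for constraint in constraints:
            block = find_block(blocks, constraint.test)
            if block is None:
                break  # constraint unsatisfied, so this domain is rejected
            found[constraint.field] = block.text
        else:
            return domain, found  # every constraint satisfied
    return None, {}


envelope = [Block("POSTAGE", 0.9, 0.1), Block("Buffalo NY 14260", 0.5, 0.6)]
article = [Block("A Study of Reading", 0.5, 0.1), Block("Abstract: We study", 0.5, 0.3)]
print(read_document(envelope)[0])
print(read_document(article)[0])
```

In this sketch, classification and extraction come from the same loop: the domain knowledge decides which tool calls are made and how much content is interpreted, so supporting a new document type means adding an entry to the knowledge base rather than changing the control code.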

Keywords/Search Tags:

Document, Reading, Framework, DIU, Adaptive
