
Image-based Form Recognition Algorithm And Automatic Entry System

Posted on: 2019-02-28 | Degree: Master | Type: Thesis
Country: China | Candidate: J Guo | Full Text: PDF
GTID: 2348330542498386 | Subject: Electronics and Communications Engineering
Abstract/Summary:
As testing workloads have grown in recent years, many enterprises have adopted electronic systems to manage test data more effectively. However, a large amount of completed test data is still stored on paper, and entering it into these systems requires considerable human effort. To reduce this inefficient manual entry, this thesis studies an image-based form recognition algorithm and implements automatic entry of the data in a form. The thesis designs a general algorithm that detects form lines and locates the cells of a test form, improves recognition efficiency by training a handwritten-digit language library on test data from the specific set of test forms, and implements a Web-based automatic data entry system that is used to manage insulator tests. The system provides image-based location, recognition, and automatic entry of test data. The main work of the thesis is as follows:

(1) A form line detection and cell location algorithm was studied. The thesis reviewed image preprocessing methods and used them to preprocess and correct the original form image obtained by scanning or photographing. It then used a Hough transform algorithm with adaptive, dynamic parameter adjustment, together with a second cell-locating step, to detect the form lines and to locate and extract the cells.

(2) A form content recognition method was implemented. The thesis extracted image samples of the specific test data, trained a handwritten-digit language library on this training set, and used Tesseract-OCR to recognize the experimental data in each cell. The experimental results showed that training a handwritten-digit language library on test data from the specific set of test forms improves the recognition efficiency for that data. The accuracy of handwritten numeral recognition remained above 92%, which meets enterprise-level application requirements with a small amount of manual intervention.

(3) Automatic data entry modules were developed. The thesis designed a Web-based automatic data entry system for test data and applied it to the management of insulator tests. The system receives a photograph or scan of a test form, detects the form lines and locates the cells, recognizes the content of each cell (the test data), and automatically enters the result into the corresponding form in the management platform.
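The cell location step in (1) can be illustrated with a minimal sketch. It is not the thesis's algorithm: the abstract does not describe the adaptive parameter adjustment or the second cell-locating step, so the code below only assumes that a line detector (for example a Hough transform) has already returned the pixel coordinates of horizontal and vertical form lines, and that the form is a regular grid with no merged cells. The names `merge_close`, `locate_cells`, and the tolerance `tol` are hypothetical.

```python
from typing import List, Tuple

# A cell is described by its bounding box: (left, top, right, bottom).
Box = Tuple[int, int, int, int]


def merge_close(positions: List[float], tol: float = 5.0) -> List[int]:
    """Collapse line coordinates that lie within `tol` pixels of their
    neighbour into one coordinate (the rounded mean of the group).

    Line detectors often report one thick ruling as several nearby lines.
    """
    merged: List[int] = []
    group: List[float] = []
    for p in sorted(positions):
        # Start a new group when the gap to the previous line exceeds tol.
        if group and p - group[-1] > tol:
            merged.append(round(sum(group) / len(group)))
            group = []
        group.append(p)
    if group:
        merged.append(round(sum(group) / len(group)))
    return merged


def locate_cells(h_lines: List[float], v_lines: List[float],
                 tol: float = 5.0) -> List[Box]:
    """Return cell bounding boxes, row by row, for a regular grid.

    h_lines: y-coordinates of detected horizontal lines (assumed input).
    v_lines: x-coordinates of detected vertical lines (assumed input).
    """
    ys = merge_close(h_lines, tol)
    xs = merge_close(v_lines, tol)
    return [
        (xs[j], ys[i], xs[j + 1], ys[i + 1])
        for i in range(len(ys) - 1)
        for j in range(len(xs) - 1)
    ]


# Example: a 2 x 2 form whose top and left rulings were each detected twice.
cells = locate_cells(h_lines=[10, 12, 110, 112, 210],
                     v_lines=[20, 22, 220, 420])
```

Each returned box could then be cropped from the corrected image and passed to the recognition step described in (2). Real forms with merged cells or broken rulings would need more than this grid assumption.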
Keywords/Search Tags: form image, form line detection, cell location, character recognition, automatic data entry