
Key Technologies Of Handwritten English Word Recognition For Examination Paper

Posted on: 2021-03-20   Degree: Master   Type: Thesis
Country: China   Candidate: W R Cai   Full Text: PDF
GTID: 2428330602981510   Subject: Computer technology
Abstract/Summary:
With the arrival of the information age, the continuous development of computer technology, and the rise of artificial intelligence, informatization, digitization, and intelligence have become the trend of social development. At present, in the field of examinations, examination papers are mostly stored on paper or as images. To facilitate the analysis and preservation of these papers, the data on them must be transcribed into text, which then enables informatized and intelligent processing of the papers. Handwritten English word recognition is a key technology for this processing. In this thesis, we design and implement the word-recognition technologies needed for the intelligent processing of examination papers, namely word segmentation, word recognition, and post-processing of the recognition results. These technologies have been successfully applied to English word recognition on the English papers of the college entrance examination. The design and implementation are as follows:

(1) Word segmentation. First, because the background area of the image is much larger than the foreground area and the gray-level distributions of the two differ greatly, we select a global binarization method. Then we combine the projection method with the dynamic line segmentation method to obtain a new method for segmenting text lines. Next, we design a projection method with mean filtering to perform word segmentation. Finally, we analyze and handle segmentation errors. Results show that the word segmentation method designed in this thesis is simple and efficient.

(2) English word recognition. We design and implement a handwritten English word recognition network based on Seq2Seq. First, a Convolutional Neural Network (CNN) with 5 convolution layers, designed in this thesis, extracts the spatial features of the word image. Then an improved multi-layer bidirectional long short-term memory (LSTM) network serves as the encoder, encoding the spatial features into feature codes. Finally, the feature codes are fed into an LSTM decoder with an attention mechanism for decoding, and Beam Search is used to widen the decoding search and further improve recognition accuracy. The network has been trained and tested on relevant data sets, and the results show that the model achieves satisfactory results.

(3) Post-processing of word recognition. We analyze the common errors in word recognition, then design and implement a recognition error correction method based on Bayesian theory. This method corrects the recognition results using a probability model. Results show that the method partly improves recognition accuracy.
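
The mean-filtered projection step mentioned under word segmentation can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the window size, and the threshold are all assumptions chosen for illustration, and the input is assumed to be an already binarized text line given as rows of 0 (background) and 1 (ink). The idea is that smoothing the column projection fills narrow gaps between letters, so only wider gaps between words split the line.

```python
from typing import List, Tuple


def column_projection(binary: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each column of a binarized text line."""
    return [sum(col) for col in zip(*binary)]


def mean_filter(profile: List[int], window: int) -> List[float]:
    """Smooth a 1-D profile with a centered moving average (truncated at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo, hi = max(0, i - half), min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


def segment_words(
    binary: List[List[int]], window: int = 3, threshold: float = 0.0
) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, where the
    smoothed projection stays above the threshold.

    `window` and `threshold` are hypothetical tuning parameters.
    """
    smoothed = mean_filter(column_projection(binary), window)
    spans, start = [], None
    for x, value in enumerate(smoothed):
        if value > threshold and start is None:
            start = x  # a candidate word begins
        elif value <= threshold and start is not None:
            spans.append((start, x))  # the word ends at the first empty column
            start = None
    if start is not None:
        spans.append((start, len(smoothed)))
    return spans


# One-row toy "image": two letters separated by a 1-column gap,
# then a 4-column gap, then a second word.
line = [[1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0]]
words = segment_words(line, window=3)
letters = segment_words(line, window=1)  # window=1 disables smoothing
```

With a window of 3 the one-column gap between letters is bridged and the line splits only at the four-column gap. With a window of 1 the same line breaks into three pieces. Note that smoothing also widens each span by up to `window // 2` columns on either side, so a real pipeline would likely trim the spans back to the ink boundaries.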
Keywords/Search Tags:Word segmentation, Seq2Seq, Attention mechanism, Beam search, Bayesian theory