Multi-font Chinese Character Recognition Based On Stroke Segmentation

Posted on: 2024-09-08

Degree: Master

Type: Thesis

Country: China

Candidate: J Shu

Full Text: PDF

GTID: 2545306935499614

Subject: Computer technology

Abstract/Summary:

Chinese characters are widely used in daily life as a carrier of information. With the continuous development of computer technology, Optical Character Recognition (OCR) has made breakthrough progress. Chinese character recognition methods based on modern deep neural networks automatically extract useful high-level semantic features from images to complete recognition. Their end-to-end training dramatically improves model accuracy and makes the models robust to external interference such as exposure changes, rotation, and scaling.

However, such methods treat Chinese character recognition as a multi-class classification problem and use Softmax to directly output a probability distribution over the predicted characters. When the character library is large, the computational cost of Softmax is high, so both training and inference become significantly more expensive. Moreover, the dimension of the Softmax output corresponds one-to-one with the number of characters in the character library, which means that once the character library is fixed, the set of characters the model can recognize is also fixed. Expanding the character library, even by a single character, requires retraining the network from scratch. In addition, the training data must cover every character in the character library: the larger the character library, the more complex the network and the more training data it requires.

To address these difficulties, this thesis builds on deep learning theory and starts from the most basic information in Chinese characters, their stroke structure. It performs Chinese character recognition by stroke matching and introduces style transfer to cope with font changes. The main contributions are summarized as follows:

(1) This thesis proposes a new stroke segmentation and extraction method that solves the problem of stroke overlap caused by intersection points in traditional stroke segmentation and extraction methods. Moreover, the model can segment and extract strokes from complex traditional Chinese characters while using only a small number of simplified Chinese characters. The stroke segmentation and extraction accuracy on the regular script Chinese character datasets reaches 94.47%.

(2) This thesis proposes a Chinese character style transfer algorithm based on the Chinese character skeleton and an attention mechanism. It addresses the problems of missing content and inaccurate style transfer in existing Chinese character image style transfer methods. Within the overall tool, it effectively improves the generalization performance of the stroke segmentation model, making it applicable to different font styles. Meanwhile, the unsupervised style transfer method greatly reduces the cost of building stroke datasets.

(3) A deep-learning-based text recognition tool has been developed. Users can upload local document images to be recognized, and the tool also integrates the stroke segmentation and extraction module and the style transfer module. The modules are independent of one another, so when new Chinese characters need to be added, there is no need to collect data and retrain the model; only the stroke information of the new characters needs to be added to the database. This effectively reduces the development and maintenance cost of the recognition tool and improves the accuracy of character recognition.
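The extensibility argument can be illustrated with a minimal sketch of recognition by stroke matching. The abstract does not specify the matching procedure, so everything here is an assumption made for illustration: the `STROKE_DB` dictionary, the pinyin stroke names (`heng`, `shu`, `pie`, `na`), the `recognize` function, and the use of Python's `difflib.SequenceMatcher` as the similarity measure. The sketch also compares stroke types only; a real system would need stroke positions as well, since distinct characters can share the same stroke sequence.

```python
from difflib import SequenceMatcher

# Hypothetical stroke database: character -> ordered stroke types.
# The entries and stroke names are illustrative, not taken from the thesis.
STROKE_DB = {
    "十": ("heng", "shu"),
    "人": ("pie", "na"),
    "大": ("heng", "pie", "na"),
    "木": ("heng", "shu", "pie", "na"),
}


def similarity(a, b):
    """Similarity in [0, 1] between two stroke sequences."""
    return SequenceMatcher(None, a, b).ratio()


def recognize(strokes, db):
    """Return the database character whose stroke sequence best matches."""
    strokes = tuple(strokes)
    return max(db, key=lambda ch: similarity(strokes, db[ch]))


# Strokes as they might come out of a segmentation step (assumed input).
print(recognize(["heng", "pie", "na"], STROKE_DB))  # 大

# Extending the character set is a database insert, not a retraining run.
STROKE_DB["天"] = ("heng", "heng", "pie", "na")
print(recognize(["heng", "heng", "pie", "na"], STROKE_DB))  # 天
```

In contrast to a Softmax classifier, whose output dimension is tied to the size of the character library, the lookup above grows only by one dictionary entry per new character.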

Keywords/Search Tags:

deep learning, GAN, CNN, style transfer, stroke segmentation

Related items

1	The Style-transfer Of Chinese Character Based On Deep Learning
2	Study On Pencil Drawing Style Transfer Method Based On Deep Learning
3	Chinese Painting Landscape Style Transfer Based On Deep Convolutional Neural Network
4	Design And Implementation Of Painting Art Style Rendering System Based On Deep Learning
5	Research On Oil Painting Style Generation Based On Deep Learning And Stroke Techniques
6	Research On The Integrated Processing Technology Of Sentence Segmentation And Lexical Analysis Of Ancient Texts Based On Deep Learning
7	Research On Image Style Transfer Algorithm Based On Generative Adversarial Network
8	Research On Content And Style Recognition Of Calligraphy Characters Based On Deep Learning
9	Research And Application Of Chinese Calligraphy Style Transfer For Cultural Relics Restoration
10	Stroke Segmentation And Application Of Calligraphy Combining VDSR And Graph-cut