
Research And Implementation Of Oracle Radicals And Combined Characters Recognition Based On Deep Learning

Posted on: 2022-06-20  Degree: Master  Type: Thesis
Country: China  Candidate: X Y Lin  Full Text: PDF
GTID: 2505306530990699  Subject: Software engineering
Abstract/Summary:
As the source of Chinese characters, oracle bone inscriptions are of great significance for understanding the culture and history of ancient China and even the world. However, their complicated structure and many variants have hindered progress in oracle bone research. The recognition of oracle bone inscriptions has always been one of the most important fields in oracle character research. Inspired by the decomposition of Chinese characters into radicals, this thesis proposes recognizing oracle characters from the perspective of radicals, and designs methods for recognizing oracle radicals and oracle combined characters. Treating an oracle character as a combination of radicals, rather than recognizing it as a whole, reduces the number of character categories and variants, ignores redundant information between similar characters, and enables the system to recognize unseen oracle character categories. This approach can greatly improve the efficiency with which experts examine and interpret unseen characters, and further promote the inheritance and development of oracle bone inscriptions. It has important application value for oracle bone inscription research. The specific work of this thesis is as follows:

1. There is currently no standard oracle radical dataset at home or abroad, and existing oracle font databases contain few combined characters. Through data augmentation, character sampling, and semi-automatic cutting and classification of sampling tables, this thesis constructs the oracle radical character dataset (ORCD), with 15 categories and 10,412 samples. Computer-assisted splicing is then used to splice the radicals into six common structures of oracle combined characters. On this basis, an online handwriting collection system for oracle combined characters is also designed. Finally, an oracle combined-character dataset (OCCD) is constructed, with 1,320 categories and 462,186 samples. It expands the combined-character data that is scarce in existing oracle font databases and provides data for research on oracle handwriting recognition.

2. For single oracle radicals, this thesis designs an oracle radical extraction and recognition framework (ORERF) based on deep learning. First, maximally stable extremal regions (MSER) are combined with a custom post-processing algorithm to generate radical-level annotations. Then the annotated data is fed to the detection network, which uses the U-Net architecture and an attention mechanism to extract single-radical features and passes the feature map to the detection module for radical localization. Finally, each radical is cropped according to its coordinates and input to the auxiliary classifier network for recognition. The recognition network can, to a certain extent, handle the problem of multiple variants of a single radical.

3. For oracle combined-character recognition, because ImageNet is a large dataset for visual object recognition research, this thesis investigates fine-tuning the parameters and structure of convolutional neural network models pre-trained on ImageNet, so that the convolutional features better represent oracle combined characters while the constraints imposed by the amount of training data and the model's computing time are reduced. The experimental results show that the recognition model achieves an accuracy of 98.4% on the OCCD dataset.

4. An automatic recognition system for oracle radicals and combined characters is designed and implemented, encapsulating the detection and recognition modules. The front end of the system is implemented with HTML5, CSS3, jQuery, and Bootstrap. To facilitate integration of the model, the back end uses the Django framework with Python as the development language. The system automatically recognizes the selected oracle image and displays the result in the front end.
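
The computer-assisted splicing of radicals into combined-character structures (item 1) can be illustrated with a minimal sketch. The abstract does not name the six structures or describe the splicing algorithm, so everything here is an assumption made for illustration: the two layouts (left-right and top-bottom), the binary list-of-lists image representation, and the helper names `pad_to`, `splice_left_right`, and `splice_top_bottom` are hypothetical and are not taken from the thesis.

```python
from typing import List

# Assumed representation: a binary glyph image, 1 = ink, 0 = background.
Image = List[List[int]]


def pad_to(img: Image, height: int, width: int) -> Image:
    """Centre img on a blank canvas of the given size (hypothetical helper)."""
    h, w = len(img), len(img[0])
    if h > height or w > width:
        raise ValueError("canvas is smaller than the image")
    top = (height - h) // 2
    left = (width - w) // 2
    canvas = [[0] * width for _ in range(height)]
    for r in range(h):
        for c in range(w):
            canvas[top + r][left + c] = img[r][c]
    return canvas


def splice_left_right(a: Image, b: Image, gap: int = 1) -> Image:
    """Place radical a to the left of radical b, separated by blank columns."""
    height = max(len(a), len(b))
    a = pad_to(a, height, len(a[0]))
    b = pad_to(b, height, len(b[0]))
    return [row_a + [0] * gap + row_b for row_a, row_b in zip(a, b)]


def splice_top_bottom(a: Image, b: Image, gap: int = 1) -> Image:
    """Place radical a above radical b, separated by blank rows."""
    width = max(len(a[0]), len(b[0]))
    a = pad_to(a, len(a), width)
    b = pad_to(b, len(b), width)
    return a + [[0] * width for _ in range(gap)] + b


# Two toy "radicals": a 3x2 block and a single pixel.
radical_a = [[1, 1], [1, 1], [1, 1]]
radical_b = [[1]]

left_right = splice_left_right(radical_a, radical_b)
top_bottom = splice_top_bottom(radical_a, radical_b)
```

In a real pipeline each spliced image would presumably also be labeled with its radical pair and layout, so that one set of radical samples yields many combined-character samples; how the thesis actually does this is not stated in the abstract.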
Keywords/Search Tags:Oracle automatic recognition, Oracle radicals, text detection, MSER, U-Net network