Research On Character Recognition Of Xixia Ancient Books Based On Optimization Segmentation And Extraction

Posted on: 2020-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: X L Li
GTID: 2415330578977306
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information technology, the theory and practice of artificial intelligence have matured and their fields of application continue to expand. This thesis applies artificial intelligence techniques to the protection of ancient books. Optical character recognition (OCR) is a common way to recognize Tangut script automatically: it uses recognition algorithms to identify the characters in digital images and convert them into machine-readable text. Taking a Tangut ancient text (Golden Light Sutra Recitation I) as an example, this thesis studies and applies currently popular artificial intelligence algorithms to recognize its characters automatically.

Tangut characters are constructed in the same way as Chinese characters. Their glyphs resemble Chinese characters and have many strokes; many are ideographic and few are pictographs. Recognition faces two main difficulties: 1. the Tangut character set is large, the structure of the characters is complex, and the similarity between characters is extremely high; 2. handwritten Tangut characters touch and overlap severely, which makes image segmentation difficult. This research therefore draws on recognition methods for Chinese characters to improve the Tangut recognition rate.

The specific work of this thesis is: 1. an introduction to the research background of Tangut, the significance of digitally preserving ancient books, and the state of research in China and abroad; 2. preprocessing and segmentation of the ancient book images, including image binarization, mathematical morphology processing, and segmentation that combines edge detection with connected-component analysis; 3. extraction of histogram of oriented gradients (HOG) features from the segmented Tangut characters using a trilinear interpolation algorithm; 4. recognition of the Tangut HOG features with three classifiers (SVM, RF, K-NN) and identification of the classifier with the highest experimental recognition rate.
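To make the feature-extraction step concrete, the following is a minimal pure-Python sketch of HOG voting with trilinear interpolation. It is not the thesis's implementation: the function name `hog_trilinear`, the cell size, the number of bins, the central-difference gradient, and the use of unsigned orientations (0 to 180 degrees) are all assumptions chosen for illustration. Each pixel's gradient magnitude is shared between the two nearest orientation bins and the four nearest cells, giving up to eight weighted votes per pixel.

```python
import math


def hog_trilinear(img, cell=4, bins=9):
    """Illustrative HOG cell histograms with trilinear vote interpolation.

    img is a list of rows of grayscale values. Returns hist[cy][cx][bin].
    All parameters and defaults are assumptions, not values from the thesis.
    """
    h, w = len(img), len(img[0])
    ny, nx = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(nx)] for _ in range(ny)]
    bin_width = 180.0 / bins  # unsigned orientation, 0..180 degrees

    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradient at (x, y).
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0

            # Continuous coordinates measured from bin and cell centres.
            fb = angle / bin_width - 0.5
            fx = (x + 0.5) / cell - 0.5
            fy = (y + 0.5) / cell - 0.5
            b0, x0, y0 = math.floor(fb), math.floor(fx), math.floor(fy)
            wb, wx, wy = fb - b0, fx - x0, fy - y0

            # Split the vote over 2 rows x 2 columns of cells x 2 bins.
            for dy, vy in ((0, 1.0 - wy), (1, wy)):
                cy = y0 + dy
                if not 0 <= cy < ny:
                    continue  # votes falling outside the grid are dropped
                for dx, vx in ((0, 1.0 - wx), (1, wx)):
                    cx = x0 + dx
                    if not 0 <= cx < nx:
                        continue
                    for db, vb in ((0, 1.0 - wb), (1, wb)):
                        # Orientation bins wrap around (180 deg == 0 deg).
                        hist[cy][cx][(b0 + db) % bins] += mag * vy * vx * vb
    return hist


if __name__ == "__main__":
    # Diagonal ramp: every interior gradient points at 45 degrees.
    ramp = [[x + y for x in range(8)] for y in range(8)]
    for row in hog_trilinear(ramp):
        for cell_hist in row:
            print([round(v, 2) for v in cell_hist])
```

With nine bins the bin centres sit at 10, 30, 50, ... degrees, so a 45-degree gradient in the ramp example lands between the 30-degree and 50-degree bins and its vote is split 1:3 in favour of the closer one. The resulting per-cell histograms would normally be block-normalized and concatenated before being passed to a classifier such as SVM, RF, or K-NN; that step is omitted here.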
Keywords: Tangut, edge detection, connected region analysis, trilinear interpolation, classifier