Design Of Auxiliary Recognition System For Annotation And Translation Of Tangut Literature

Posted on: 2022-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
GTID: 2505306347481404
Subject: Electronics and Communications Engineering
Abstract/Summary:
Tangut script is an ancient script created, used and recorded by the Tangut Dynasty. Interpreting and analyzing ancient Tangut literature is an important method in Tangut studies. However, researchers proficient in Tangut characters are scarce, and traditional literature search and retrieval is time-consuming and laborious. In response to these problems, this thesis designs and implements an auxiliary recognition system platform for the annotation and translation of Tangut documents. As an auxiliary tool for interpreting Tangut documents, it provides character recognition and text retrieval to make the study of ancient Tangut documents more efficient. The main research contents are as follows:
(1) Determine the source of the Tangut text data, obtain 98 images of Tangut literature by scanning, and use a text extraction algorithm to extract single characters. The 16,320 extracted single-character samples are divided into 668 categories according to frequency of use, and a single-character sample data set is constructed. The Chinese interpretation and other information for the extracted characters are checked against the corresponding Tangut reference works, and a MySQL text database is established.
(2) According to the characteristics of Tangut characters, a convolutional neural network for text recognition is built, and the sample set is expanded on the basis of the extracted single-character data set. The expanded data are divided into a training set and a test set to train and test the network; the recognition accuracy reaches up to 80.16%, and the trained network is saved as a .h5 recognition model for later use.
(3) Design and implement a system interaction platform based on the B/S (browser/server) architecture, using the MTV (model, template, view) design pattern of the Django framework to complete the layered processing of data services. The front end implements the Web interface by embedding CSS and jQuery in HTML pages, PHP connects to the database and operates on the data, and the lightweight data format JSON carries the data exchanged between the front end and the back end, establishing the Tangut literature annotation and recognition platform.
This auxiliary recognition system makes it convenient for Tangut scholars and enthusiasts to consult and retrieve the Chinese interpretation, source and context translation of Tangut characters.
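The retrieval step at the end of the abstract (looking up the Chinese interpretation, source and context translation of a recognized character and returning it as JSON) might look roughly like the sketch below. This is not the thesis's code: the abstract does not give the database schema or the response format, so the in-memory table standing in for MySQL, the field names, the class identifier `L0001` and the function `lookup_character` are all assumptions made for illustration, and the record values are placeholders rather than real data.

```python
import json

# Hypothetical in-memory stand-in for the MySQL text database. The real
# schema is not given in the abstract, so every field name here is assumed
# and the values are placeholders, not real dictionary entries.
CHARACTER_TABLE = {
    "L0001": {
        "interpretation_zh": "<Chinese interpretation>",
        "source": "<source document>",
        "context_translation": "<context translation>",
    },
}


def lookup_character(class_id: str) -> str:
    """Return a JSON string for the character class predicted by the recognizer."""
    record = CHARACTER_TABLE.get(class_id)
    if record is None:
        payload = {"found": False, "class_id": class_id}
    else:
        payload = {"found": True, "class_id": class_id, **record}
    # ensure_ascii=False keeps Chinese and Tangut text readable in the response.
    return json.dumps(payload, ensure_ascii=False)
```

In a Django deployment the same dictionary would typically be built from a database query inside a view and returned to the jQuery front end, which would then render the three fields next to the recognized character.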
Keywords: Tangut script, MySQL, convolutional neural networks, Django framework, jQuery, PHP