Dunhuang documents are among the precious heritages of Chinese culture. Their early date of production, poor preservation conditions, and irregular writing result in low image quality, stained pages, and adhering text, making accurate text recognition difficult. Pre-processing can improve recognition accuracy and provide more reliable support for digitizing Dunhuang Tibetan documents: by reducing the interference of redundant information and highlighting important features, it yields clearer and more accurate image information. Earlier pre-processing work on the recognition of ancient Dunhuang Tibetan literature mainly used traditional methods, such as undifferentiated denoising and color enhancement of the entire image. For special documents such as the Dunhuang manuscripts, however, these methods no longer meet the needs of digitization. Pre-processing research on Dunhuang ancient texts using deep learning is therefore an indispensable technique for addressing the recognition challenge. In this paper, we analyze and summarize the difficulties in recognizing Dunhuang Tibetan documents, and conduct pre-processing research organized by those difficulties in order to find pre-processing methods suited to Dunhuang Tibetan documents.

The main work of this paper is as follows:
(1) Analysis and summary of the difficulties in recognizing ancient Dunhuang Tibetan documents. Combining earlier research results, we identify three main difficulties affecting text recognition: low document quality, image stains, and text adhesion. The full paper focuses its pre-processing research on these typical recognition difficulties of Dunhuang Tibetan literature.
(2) Construction of datasets of ancient Dunhuang Tibetan documents. To address the three main factors affecting recognition, namely low quality, image stains, and text adhesion, we select representative original images from the microfilm electronic images of the Dunhuang Tibetan Literature Collection of the National Library of France and construct four corresponding Dunhuang Tibetan literature datasets as the basis of our experiments.

The following research results were achieved in this paper:
(1) A pre-processing study for low-quality Dunhuang Tibetan documents was completed. ResNet is used as the backbone network, and two traditional pre-processing methods, Gaussian filtering and the maximum inter-class variance (Otsu) method, are incorporated into the image reconstruction module to form the DeblurNet pre-processing model. Its effectiveness was verified by ablation and comparison experiments on a self-constructed low-quality dataset of Dunhuang Tibetan literature: images pre-processed by the model showed a 15-17 percentage point reduction in character recognition error rate over the original images.
(2) A pre-processing study for stained images of Dunhuang Tibetan documents was completed. Using the ViT model as the core network, two traditional pre-processing methods, the Laplace operator and the piecewise linear grayscale transform, were incorporated, and the network was improved to obtain a double-defacement pre-processing model. Its effectiveness is verified by recognition accuracy improvements of 16.07% and 8.81% on the self-built lightly and heavily stained image datasets of ancient Dunhuang Tibetan documents, respectively.
(3) A pre-processing study of text adhesion in Dunhuang Tibetan documents was completed. Using the Attention U-Net as the base network, we added a combination of traditional pre-processing methods and a threshold fusion module to obtain the J-Net segmentation network, a pre-processing scheme suited to the text-adhesion layouts of ancient Dunhuang Tibetan literature. Recognition results on the constructed text-adhesion image dataset of Dunhuang Tibetan literature show that images pre-processed by this network model improve recognition accuracy by 15.77% over the original dataset.
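The two traditional operations folded into DeblurNet's image reconstruction module, Gaussian filtering and the maximum inter-class variance (Otsu) method, can be sketched in pure Python as follows. This is a minimal illustration on 8-bit grayscale images stored as 2D lists; the kernel size, sigma, and the way DeblurNet actually wires these steps into its network are assumptions for illustration, not the thesis's configuration.

```python
import math

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized 1D Gaussian kernel (separable smoothing)."""
    half = size // 2
    k = [math.exp(-(i - half) ** 2 / (2 * sigma ** 2)) for i in range(size)]
    s = sum(k)
    return [v / s for v in k]

def _conv_rows(m, k):
    """Convolve each row with kernel k, replicating edge pixels."""
    half = len(k) // 2
    h, w = len(m), len(m[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            acc = 0.0
            for d, kv in enumerate(k):
                jj = min(max(j + d - half, 0), w - 1)
                acc += m[i][jj] * kv
            row.append(acc)
        out.append(row)
    return out

def gaussian_blur(img, size=5, sigma=1.0):
    """Separable Gaussian smoothing: rows, then columns via transpose."""
    k = gaussian_kernel(size, sigma)
    tmp = _conv_rows(img, k)
    tmp = [list(r) for r in zip(*tmp)]
    tmp = _conv_rows(tmp, k)
    return [list(r) for r in zip(*tmp)]

def otsu_threshold(pixels):
    """Maximum inter-class variance (Otsu) threshold over 256 gray levels."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(t * hist[t] for t in range(256))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

In a typical pipeline the blur suppresses sensor and film-grain noise before the Otsu threshold separates ink strokes from the page background; the network then learns the reconstruction on top of this cleaned signal.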
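For the de-staining model, the two traditional methods named above, the Laplace operator and the piecewise (segmented) linear grayscale transform, can likewise be sketched in pure Python. The breakpoints r1, s1, r2, s2 below are hypothetical illustration values; the thesis does not specify its transform parameters here.

```python
def laplacian_sharpen(img):
    """4-neighbour Laplacian subtracted from the image to emphasise stroke edges."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            c = img[i][j]
            lap = (img[max(i - 1, 0)][j] + img[min(i + 1, h - 1)][j]
                   + img[i][max(j - 1, 0)] + img[i][min(j + 1, w - 1)] - 4 * c)
            out[i][j] = min(max(c - lap, 0), 255)  # clamp to 8-bit range
    return out

def piecewise_linear(v, r1=80, s1=30, r2=180, s2=230):
    """Three-segment linear grayscale map stretching mid-range contrast
    (breakpoints are illustrative, not the thesis's values)."""
    if v < r1:
        return v * s1 / r1
    if v < r2:
        return s1 + (v - r1) * (s2 - s1) / (r2 - r1)
    return s2 + (v - r2) * (255 - s2) / (255 - r2)
```

The sharpening step recovers stroke boundaries weakened by stains, while the grayscale stretch pushes stain tones toward the background and ink tones toward black, making the two easier for the ViT-based network to separate.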
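The abstract does not detail J-Net's threshold fusion module. One plausible reading, sketched below purely as an assumption, is a weighted combination of the network's segmentation probability map with a traditional binarization, followed by a final threshold; both the fusion weight `alpha` and the threshold `t` are hypothetical parameters.

```python
def threshold_fusion(prob_map, binary_map, alpha=0.6, t=0.5):
    """Hypothetical fusion: blend a network probability map (values in [0, 1])
    with a traditional 0/1 binarization, then threshold to a final mask."""
    h, w = len(prob_map), len(prob_map[0])
    return [[1 if alpha * prob_map[i][j] + (1 - alpha) * binary_map[i][j] >= t else 0
             for j in range(w)] for i in range(h)]
```

Under this reading, the traditional binarization acts as a prior that can pull apart adhering strokes where the network's probabilities are ambiguous.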