
Research And Implementation Of Handwritten Ancient Book Text Detection And Tampering Detection

Posted on: 2024-07-30  Degree: Master  Type: Thesis
Country: China  Candidate: Q Liu
GTID: 2568307106490134  Subject: Electronic information
Abstract/Summary:
Handwritten ancient books are among the most important sources for documentary research and are rich in historical, cultural, and academic value. During transmission, however, they are vulnerable to damage or loss from natural disasters, vandalism, and other factors, so digitising them has become an urgent task. Manually classifying and transcribing these books costs researchers a great deal of time and effort. Optical character recognition (OCR) can make the recognition and transcription of handwritten ancient books faster and more accurate, and can provide data support for building an ancient book database platform. In addition, the authenticity of images of ancient texts is often questioned, and there are numerous cases in which tampered images of ancient texts have been used as evidence in academic research. This compromises the integrity and authenticity of ancient texts and seriously harms their study and cultural transmission. Research on detecting tampering in images of ancient texts therefore helps ensure the authenticity and credibility of the study of ancient texts and promotes further progress in their digitisation. This thesis investigates text detection in handwritten ancient books and tampering detection in images of ancient books. It focuses on the effects of attention mechanisms and multi-scale feature fusion, with the aim of proposing more robust models for text detection and image tampering detection. The main research work and contributions of this thesis are as follows:

(1) Although the digitisation of ancient books continues to advance, research on them remains relatively limited, so well-annotated ancient book datasets suitable for deep learning are scarce, particularly for calligraphic works. Calligraphy is nevertheless an essential component of Chinese culture: it can help individuals develop their inner cultivation and temperament while also promoting the values and spirit of Chinese culture. For this reason, in collaboration with the Institute of Calligraphy of Southwest University, this thesis constructs a dataset of Su Shi's calligraphic works and develops a set of annotation rules based on the dataset's characteristics; the works are annotated at the character level. The dataset serves as a benchmark for text detection in handwritten ancient books, providing data support for the subsequent research while also advancing the digitisation of ancient books.

(2) Most existing methods for detecting text in ancient books apply to printed or handwritten texts with standardised arrangements and uniform styles. These methods do not generalise to images of calligraphic works with complex layouts, severe noise interference, and intricately arranged characters. In particular, existing detection methods struggle to identify character edges in calligraphic works written with connected (cursive) strokes. To address these problems, this thesis proposes a multi-scale detection model, MA-DBNet, based on the characteristics of handwritten ancient books. The model introduces dilated (atrous) convolution into the traditional ResNet backbone network to enlarge the receptive field and proposes a locally updatable attention mechanism for the initial fusion of features. In the spatial attention mechanism of the adaptive scale fusion module, the model also performs feature fusion with dilated convolution instead of ordinary convolution, which allows it to focus better on fine-grained features. Experimental results on the Su Shi calligraphy dataset demonstrate that MA-DBNet is robust in detecting text in handwritten ancient books, especially script with connected strokes.

(3) In the study of ancient books, it is common to encounter differing views on the same historical material. Some individuals, however, resort to tampering with images of ancient texts in order to support their own positions. Such behaviour not only misleads researchers but also undermines the credibility of historical documents and archival materials, and with it public trust. This thesis therefore presents the first dataset of tampered ancient book images, created by using Photoshop (PS) to manipulate characters and seals in ancient book images. Based on this dataset, the thesis proposes a new model for detecting and recognising tampering in ancient text images, called MDAS-Net. The model introduces a novel feature fusion approach in the edge-supervised branch and incorporates an E-2-N/N-2-E Help Block for feature fusion in both branches. Experiments show that MDAS-Net performs well on both the ancient book image tampering dataset and the TTI dataset.

(4) In calligraphy research, researchers at the Faculty of Arts often use Photoshop cutouts ("PS keying") to extract images of individual characters from ancient texts. They then have to undertake the tedious task of collating and classifying these images before they can produce a calligraphy dictionary. In response, this thesis presents a handwritten text detection and cutting system that incorporates the improved handwritten text detection model and the tampering detection model. The system improves the efficiency of dictionary compilation at the faculty, provides strong support for the digitisation of ancient books, and contributes to the preservation and transmission of Chinese cultural heritage.
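
Contribution (2) relies on convolution with gaps between kernel taps (dilated, or atrous, convolution) to enlarge the receptive field without adding weights. The following is only a minimal illustrative sketch of that general idea in plain Python, not the MA-DBNet implementation; the function names `effective_kernel_size` and `dilated_conv1d` and the toy signal are assumptions introduced here for illustration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D cross-correlation with `dilation` steps between kernel taps.

    Hypothetical helper for illustration; real models would use a deep
    learning framework's 2D convolution with a dilation argument.
    """
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Each tap i reads the input at offset i * dilation from the window start.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]  # three weights in both cases

dense = dilated_conv1d(signal, kernel, dilation=1)    # each output sees 3 inputs in a row
dilated = dilated_conv1d(signal, kernel, dilation=2)  # each output spans 5 input positions

print(effective_kernel_size(3, 1), dense)
print(effective_kernel_size(3, 2), dilated)
```

With the same three weights, a dilation of 2 lets each output position cover five input positions instead of three, which is the sense in which dilation widens the receptive field at no extra parameter cost.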
Keywords/Search Tags:Antiquarian datasets, Handwritten text detection, Text image tampering detection, Deep learning, Attention mechanism