Research On Natural Language Watermarking Based On Syntactic Transformations

Posted on: 2009-09-28
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
Full Text: PDF
GTID: 2178360272992346
Subject: Computer application technology
Abstract/Summary:
Text watermarking is a very active research direction in digital watermarking and has found many applications in e-commerce, e-government, national security, and copy protection. It has therefore attracted growing attention. However, earlier approaches based on document images and formatting cannot resist optical character recognition or reformatting attacks and cannot be applied to plain text, which restricts their use in real applications.

To deal with these problems, text watermarking based on natural language processing, namely natural language watermarking, was proposed. In this approach, watermark information is embedded into the content of the cover text, provided that the original meaning is preserved. Because natural language watermarking is independent of document format, it has wider application prospects. Much research has been done on natural language watermarking for English texts, but there has been little study on the Chinese language.

Building on previous studies, this thesis proposes a dependency-based automatic syntactic transformation scheme and a novel watermarking scheme based on syntactic transformations. The proposed schemes employ dependency grammar, transformation analysis, and state-of-the-art natural language processing techniques.

In the proposed syntactic transformation scheme, transformation rules are manually collected and represented via dependency relations. Then, drawing on ideas from paraphrasing and transfer-based machine translation, the thesis proposes a dependency-based scheme and applies it to generate meaning-preserving sentences, which are used in the watermark embedding process.

In the proposed watermarking scheme, a sequence permutation algorithm is chosen for sentence selection after comparison with a dependency tree sorting algorithm. Based on a study of the distribution of morphemes in Chinese words, a new Chinese-specific approach to carrying watermark bits using sentence weight is proposed. To improve embedding accuracy and enhance watermarking robustness, the final scheme also applies a sentence grouping approach for watermark embedding and error-correcting codes for watermark information encoding.

Experimental results show that the proposed algorithm has relatively high accuracy, good imperceptibility and robustness, and satisfactory capacity.
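
The abstract does not give the weight function, the grouping rule, or the error-correcting code, so the following is only a toy sketch of the general idea: each sentence carries one bit through the parity of a "sentence weight", embedding picks a meaning-preserving variant whose parity matches the bit, and a repetition code stands in for the error-correcting code. Every name here (sentence_weight, embed, extract, and so on) is hypothetical, the weight is a simple sum of code points rather than anything morpheme-based, and sentence grouping and sequence permutation are not modeled.

```python
from typing import List


def sentence_weight(sentence: str) -> int:
    # Assumption: a placeholder weight (sum of Unicode code points).
    # The thesis derives its weight from morpheme distribution instead.
    return sum(ord(ch) for ch in sentence)


def carried_bit(sentence: str) -> int:
    # The bit a sentence carries is the parity of its weight.
    return sentence_weight(sentence) % 2


def encode_repetition(bits: List[int], n: int = 3) -> List[int]:
    # Stand-in error-correcting code: repeat every bit n times.
    return [b for b in bits for _ in range(n)]


def decode_repetition(bits: List[int], n: int = 3) -> List[int]:
    # Majority vote over each complete block of n received bits.
    decoded = []
    for i in range(0, len(bits) - n + 1, n):
        block = bits[i:i + n]
        decoded.append(1 if sum(block) * 2 > n else 0)
    return decoded


def embed(candidates: List[List[str]], bits: List[int]) -> List[str]:
    # candidates[i] holds the original sentence first, followed by its
    # meaning-preserving transformations (assumed to be supplied).
    marked = []
    for options, bit in zip(candidates, bits):
        match = next((s for s in options if carried_bit(s) == bit), None)
        # If no variant has the right parity, keep the original sentence;
        # the error-correcting code is meant to absorb such misses.
        marked.append(match if match is not None else options[0])
    # Sentences beyond the payload length are left unchanged.
    marked.extend(options[0] for options in candidates[len(bits):])
    return marked


def extract(sentences: List[str], n_bits: int) -> List[int]:
    # Read back one bit per sentence from the first n_bits sentences.
    return [carried_bit(s) for s in sentences[:n_bits]]
```

With a real system, the candidate lists would come from the syntactic transformation step, and decode_repetition(extract(text, n), 3) would recover the payload as long as no more than one sentence per block of three has its parity changed.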
Keywords/Search Tags: Information hiding, Text watermarking, Natural language watermarking, Natural language processing