Research On Reversible Natural Language Watermarking Based On Synonym Substitution

Posted on: 2020-01-09 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Li
GTID: 2518306311483464 | Subject: Computer Science and Technology
Abstract/Summary:
Reversible natural language watermarking is an important branch of natural language watermarking. It embeds watermark information in a natural language text carrier in a reversible way: after the watermark is extracted, the content of the original carrier can be completely recovered, which serves both copyright protection and lossless content recovery. To address the problems of existing reversible natural language watermarking methods, this thesis focuses on lossless compression and prediction error expansion techniques and studies reversible natural language watermarking based on synonym substitution. The main research contents and results are as follows:

1. To achieve large-capacity watermark embedding and lossless recovery of the original text, a reversible natural language watermarking method based on arithmetic coding and synonym substitution is proposed. First, by analyzing the relative frequencies of synonymous words, the method quantizes the synonyms used to carry the payload into an unbalanced, redundant binary sequence. Then, the quantized binary sequence is losslessly compressed by adaptive binary arithmetic coding, which frees space for additional data. Finally, the compressed data, with the watermark appended, is embedded into the cover text via synonym substitutions in an invertible manner. On the receiver side, the watermark and the compressed data are extracted by decoding the values of the synonyms in the watermarked text; the original text is then perfectly recovered by decompressing the extracted data and replacing the substituted synonyms with the original ones. Experimental results demonstrate that the proposed method extracts the watermark successfully, recovers the original text losslessly, and achieves a high embedding capacity.

2. To further improve watermark capacity and imperceptibility, a reversible natural language watermarking method based on context suitability and prediction error expansion is proposed. First, the method uses word embeddings to calculate the similarity between two words and builds a large-scale synonym database from these word similarities. Second, the distances between a synonym and its contextual words are used to estimate the suitability of the synonym in the current context. By setting a threshold, the replaceable synonyms are filtered and encoded according to their context suitability. Finally, the watermark is embedded by synonym substitutions, which are determined by expanding the prediction error of the context suitability. The substituted synonyms can be restored to the original ones when the watermark is extracted. Experimental results show that the proposed method not only extracts the watermark effectively and recovers the original text reversibly, but also greatly improves the watermark capacity. Even when the synonym filtering threshold is high, the method retains high watermark capacity and strong resistance to detection.
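
The compress-then-embed idea behind the first method can be illustrated with a minimal sketch. It is not the thesis's implementation and rests on several assumptions: each replaceable word is abstracted to one bit (which member of a two-word synonym set appears at that position), zlib stands in for the adaptive binary arithmetic coder, and the two 16-bit length fields are a hypothetical framing chosen only so the receiver can split the payload. The function names (`embed`, `extract`, `capacity`) are likewise illustrative.

```python
import zlib

HEADER_BITS = 16  # assumed width of each length field (hypothetical framing)


def bits_to_bytes(bits):
    """Pack a list of 0/1 ints into bytes, zero-padded to a byte boundary."""
    padded = bits + [0] * (-len(bits) % 8)
    return bytes(
        int("".join(map(str, padded[i:i + 8])), 2)
        for i in range(0, len(padded), 8)
    )


def bytes_to_bits(data):
    """Unpack bytes into a list of 0/1 ints, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def int_to_bits(value, width=HEADER_BITS):
    """Encode a non-negative int as a fixed-width big-endian bit list."""
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def bits_to_int(bits):
    """Decode a big-endian bit list back into an int."""
    return int("".join(map(str, bits)), 2)


def capacity(cover_bits):
    """Watermark bits that fit once the cover sequence is compressed."""
    compressed = zlib.compress(bits_to_bytes(cover_bits), 9)
    return len(cover_bits) - 2 * HEADER_BITS - 8 * len(compressed)


def embed(cover_bits, watermark_bits):
    """Return the bit sequence to write back via synonym substitution."""
    # zlib is a stand-in here; the abstract names adaptive binary arithmetic coding.
    compressed = zlib.compress(bits_to_bytes(cover_bits), 9)
    payload = (
        int_to_bits(len(compressed))
        + int_to_bits(len(watermark_bits))
        + bytes_to_bits(compressed)
        + watermark_bits
    )
    if len(payload) > len(cover_bits):
        raise ValueError("watermark exceeds the spare capacity")
    # Unused positions are filled with 0 so the length matches the cover.
    return payload + [0] * (len(cover_bits) - len(payload))


def extract(stego_bits):
    """Recover (watermark_bits, original_cover_bits) from the embedded bits."""
    n_comp = bits_to_int(stego_bits[:HEADER_BITS])
    n_mark = bits_to_int(stego_bits[HEADER_BITS:2 * HEADER_BITS])
    start = 2 * HEADER_BITS
    end = start + 8 * n_comp
    watermark = stego_bits[end:end + n_mark]
    restored = zlib.decompress(bits_to_bytes(stego_bits[start:end]))
    # Drop the byte-boundary padding added before compression.
    cover = bytes_to_bits(restored)[:len(stego_bits)]
    return watermark, cover
```

In this sketch the spare room exists only because the cover sequence is redundant (mostly zeros, for instance when the more frequent synonym dominates), so its compressed form is shorter than the sequence itself. Writing the returned bits back into the text would mean substituting, at each position, the synonym whose code equals the bit; reversing `extract` then restores the original word choices.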
Keywords/Search Tags:Digital Watermarking, Reversible Natural Language Watermarking, Synonym Substitution, Arithmetic Coding, Prediction Error Expansion