Research On Steganography Based On Generative Text

Posted on:2024-08-24

Degree:Master

Type:Thesis

Country:China

Candidate:Y Huang

GTID:2568307067991519

Subject:Statistics

Abstract/Summary:

With the rapid development of the Internet, information security has received increasing attention. Maturing natural language processing technology and improved hardware have made generative linguistic steganography models a new research hotspot. Linguistic steganography algorithms currently face three main problems: low quality of generated text, small embedding capacity, and a large gap between the statistical distribution of the generated text and that of natural text. Advances in language models and a series of improvements in conditional-probability mapping and coding have steadily reduced these problems. However, some natural language processing techniques remain poorly suited to discrete data such as text, and further research is needed to analyze how the various goals relate to one another and to find better ways to balance them under different requirements. The traditional generative adversarial network (GAN) is not well suited to text data, but it becomes more suitable for text generation when combined with large language models. This paper therefore proposes TransGAN, a model for generative text steganography that uses GPT-2 as the generator and BERT as the discriminator. A new loss function is designed to optimize the generator. It retains the advantages of MaliGAN: it allows the model to keep searching for the global optimum while appropriately reducing the effect of discrete variables on accuracy. In the steganography stage, this paper applies arithmetic coding to words based on their conditional probability distribution, finds by comparison that dynamic arithmetic coding outperforms static arithmetic coding, and theoretically proves the imperceptibility and data compression invariance of arithmetic coding. Finally, the experimental results show that the proposed model outperforms the GAN-based steganography model in KL divergence and embedding capacity, with only a small difference in perplexity, and thus realizes a more secure steganography method.
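
The idea of embedding secret bits by arithmetic coding over a conditional probability distribution can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy function next_token_probs is a hypothetical stand-in for a language model such as GPT-2, the vocabulary and weights are invented for illustration, and the receiver is assumed to know the message length. The sender treats the bit string as a point in [0, 1) and repeatedly picks the token whose probability sub-interval contains that point; the receiver replays the same intervals to recover the bits.

```python
import math
from fractions import Fraction

# Hypothetical toy vocabulary; a real system would use the model's vocabulary.
VOCAB = ["the", "cat", "dog", "sat", "ran"]


def next_token_probs(context):
    """Toy stand-in for P(token | context); NOT a real language model.

    The weights rotate with the context length so that the distribution
    changes from step to step, as a real conditional distribution would.
    """
    weights = [5, 4, 3, 2, 2]  # sums to 16
    shift = len(context) % len(VOCAB)
    rotated = weights[shift:] + weights[:shift]
    return [(tok, Fraction(w, 16)) for tok, w in zip(VOCAB, rotated)]


def embed(bits):
    """Map a non-empty bit string to a token sequence."""
    n = len(bits)
    cell_lo = Fraction(int(bits, 2), 2 ** n)   # left edge of the dyadic cell
    cell_hi = cell_lo + Fraction(1, 2 ** n)    # right edge of the dyadic cell
    target = cell_lo + Fraction(1, 2 ** (n + 1))  # midpoint of the cell
    low, width = Fraction(0), Fraction(1)
    tokens = []
    # Narrow the interval until it lies entirely inside the cell,
    # at which point the bits are uniquely determined by the tokens.
    while not (low >= cell_lo and low + width <= cell_hi):
        cum = low
        for tok, p in next_token_probs(tokens):
            if cum <= target < cum + p * width:
                tokens.append(tok)
                low, width = cum, p * width
                break
            cum += p * width
    return tokens


def extract(tokens, n_bits):
    """Recover the bit string by replaying the sender's intervals."""
    low, width = Fraction(0), Fraction(1)
    context = []
    for tok in tokens:
        cum = low
        for cand, p in next_token_probs(context):
            if cand == tok:
                low, width = cum, p * width
                break
            cum += p * width
        context.append(tok)
    # The final interval sits inside one dyadic cell; its index is the message.
    return format(math.floor(low * 2 ** n_bits), "0{}b".format(n_bits))


if __name__ == "__main__":
    secret = "1011001110"
    stego = embed(secret)
    print(" ".join(stego))
    print(extract(stego, len(secret)) == secret)
```

Exact rational arithmetic (fractions.Fraction) is used here only to keep the sketch free of rounding issues; practical systems typically use fixed-precision integer arithmetic and sample from the model's real distribution.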

Keywords/Search Tags:

Linguistic steganography, Natural language processing, TransGAN, GPT-2, BERT

Related items

1	Research On Generative Natural Language Information Hiding Based On Deep Learning
2	Natural Language Analysis And Steganography Steganography Amount Detection
3	The Research On Natural Language Information Hiding
4	Distilling Bert-based Model For Natural Language Understanding
5	Research On Text Steganography
6	Task-Adaptive Compression Method For BERT Via Truncation Before Fine-Tuning
7	Study Of Chinese And English Text Steganography Based On Typos
8	Research On Code Retrieval Technology Based On Extended Query And Natural Language Processing
9	A Research On Abstract Summary Extraction Of Long Texts Based On BERT Model
10	Research On Analysis And Design Of Linguistic Steganography