| Information hiding is a principal means of covert communication and privacy protection: it secures information by concealing the very existence of the secret message. It is an important and challenging research direction in information security and has attracted extensive attention from scholars at home and abroad. Current research on generative natural-language information hiding lacks a general theoretical framework, produces long steganographic texts of poor quality, and offers limited practical secure embedding capacity when steganographic texts are short. To address these problems, this paper studies linguistic steganography based on deep learning and text generation. The main research results are as follows. (1) To improve the embedding capacity of generative linguistic steganography, a character-level linguistic steganography method based on a long short-term memory (LSTM) recurrent neural network is proposed. First, the method uses an LSTM model and a large-scale corpus to construct and train a character-level language generation model. The trained model is then used to predict the probability distribution of the next character. The candidate characters with high predicted probability are encoded as distinct values, and the character to generate is selected according to the bit values of the secret information to be embedded and the code values of the candidates, which embeds the secret information. Finally, the method applies a selection strategy to improve the quality of the steganographic text: for each piece of secret information, multiple candidate steganographic texts are generated by varying the starting string, their quality is evaluated according to the designed strategy, and the best candidate is selected as the final steganographic text. Experimental results show that this method runs faster than other comparable methods. In addition, the quality of the final steganographic text improves as the number of candidate steganographic texts increases, although the rate of improvement slows as the candidate set grows. (2) To improve the quality of generated steganographic text and the generality of constrained generative linguistic steganography, this paper first presents a general sequence-to-steganographic-sequence framework composed of a linguistic encoder and a steganographic encoder, the Sequence to Steganographic Sequence model. Then, taking summary generation as the application scenario, a multi-candidate dynamic steganographic encoding method is designed for the steganographic encoder, and a novel constrained generative linguistic steganography method is proposed. The linguistic encoder of the proposed method introduces a copy mechanism and an attention mechanism into a Bi-LSTM to encode the text information. While the summary text is decoded and generated, the steganographic encoder performs adaptive dynamic grouping encoding over multiple candidate generation sequences, so that high-quality steganographic summary text is generated under the control of the secret information. The sequence-to-steganographic-sequence framework fully realizes the transformation from the original constraint information to stegotexts. Experimental results show that improving either the linguistic encoder or the steganographic encoder under this framework gives the steganographic summary text better linguistic concealment, and that the steganographic summary texts generated by the proposed method under different conditions are even of higher quality than the natural texts generated under the basic encoder. |
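
The embedding step in result (1) can be illustrated with a minimal sketch. It is not the implementation from this work: it assumes fixed-length coding over the top 2^k candidate characters, and it replaces the trained character-level LSTM with a toy bigram count model so that the example runs on its own. The names `train_bigram`, `candidates`, `embed`, and `extract`, the toy corpus, and the parameter `k` are all hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Toy stand-in for the trained character-level LSTM (an assumption):
    counts how often each character follows another."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts


def candidates(model, alphabet, prev_char, k):
    """Return the 2**k most likely next characters in a deterministic order.
    Position i in the list is the candidate whose code value is i."""
    counts = model[prev_char]
    ranked = sorted(alphabet, key=lambda ch: (-counts[ch], ch))
    return ranked[: 2 ** k]


def embed(model, alphabet, bits, start, k):
    """Generate one character per k secret bits, starting from `start`."""
    assert start and len(bits) % k == 0 and len(alphabet) >= 2 ** k
    text = start
    for i in range(0, len(bits), k):
        code = int(bits[i : i + k], 2)  # next k secret bits as an integer
        text += candidates(model, alphabet, text[-1], k)[code]
    return text


def extract(model, alphabet, stego, start_len, k):
    """Recover the bits by re-ranking candidates at each generated position."""
    bits = []
    for pos in range(start_len, len(stego)):
        cands = candidates(model, alphabet, stego[pos - 1], k)
        bits.append(format(cands.index(stego[pos]), f"0{k}b"))
    return "".join(bits)


corpus = "the quick brown fox jumps over the lazy dog and then the dog naps"
model = train_bigram(corpus)
alphabet = sorted(set(corpus))

secret = "0110110001"
stego = embed(model, alphabet, secret, start="t", k=2)
recovered = extract(model, alphabet, stego, start_len=1, k=2)
```

Sender and receiver must share the same model, candidate ordering, starting string length, and `k`. The selection strategy described in result (1) would call `embed` with several different starting strings and keep the text that scores best under a quality measure, which this sketch leaves out.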