Benefiting from the rapid development of deep learning in recent years, research on text generation has grown steadily and advanced in several respects. Text generation methods have been applied to an increasing number of scenarios, and their practical value has become increasingly prominent. Although deep learning has brought great convenience to text generation and is now widely used, it still faces the following challenges under semantic constraints of different granularities: (1) under word-based discrete semantic constraints, how to handle the one-to-many mapping between discrete words and the generation target; (2) under structural semantic constraints that fuse the relationships between words, how to represent those structural relationships in the generation model; (3) under sentence-level semantic constraints based on short text, how to ensure semantic consistency between the source sentence and the generation target; (4) under paragraph-level semantic constraints based on long text, how to achieve semantic association between the generation target and fragments of the long text. This dissertation addresses the problems above, and its main research contents are as follows.

Firstly, to address the diverse representation of discrete semantics during text generation, a method for generating diverse questions under discrete semantic constraints is proposed. The method first adopted Transformer as the base model. Next, the previously generated questions and the text information were concatenated as input during decoding, which not only ensured that the generated questions did not deviate from the topic but also made the current question sufficiently different from previous ones. Finally, a trainable control signal was introduced during decoding to perform representation learning on the common
features of each question type, which further ensured the diversity of the generated questions. A dataset was constructed with data sourced from Baidu Zhidao. Experimental results demonstrate that the proposed diverse question generation method significantly outperforms other baseline methods in terms of relevance and diversity, proving that historical information and control signals can improve model performance.

Secondly, to address representation learning of structured semantic relations in text generation models, a text generation method under structural semantic constraints is proposed. The method first used a bidirectional encoder based on gated recurrent units to encode the topic words. Next, it adopted an encoder based on the multi-head self-attention mechanism to encode the knowledge graph, incorporating the adjacency relationships of the nodes into the attention computation to make the correlations between entities more explicit. Finally, the representations of the topic words and the knowledge graph were jointly fed into the decoder to generate the target texts. A dataset was constructed from Chinese medical literature, and the proposed model was validated on it. Experimental results show that the knowledge graph helps the generation model improve performance, and that modeling the overall structure of the knowledge graph enhances it further.

Thirdly, to address semantic consistency between sentence-level semantic representations and the generation target, a structured query language (SQL) generation method under sentence-level semantic constraints is proposed. The method first performed entity linking over the given text and used a pre-trained language model to encode the linking result. Then, according to the structural characteristics of SQL, the representation of the textual question was translated into an abstract syntax tree, which served as an intermediate
representation of the target structured query. Finally, the abstract syntax tree was converted into an executable query according to grammatical rules. The proposed model was validated on the publicly available medical text-to-SQL dataset MIMICSQL. Experimental results demonstrate that the proposed text-to-SQL method far outperforms other baseline methods on various metrics, including logical-form and execution accuracy, confirming the effectiveness of entity linking and the abstract syntax tree.

Fourthly, to address semantic reasoning between paragraph-level semantic representations and the generation target, a complex question generation method under paragraph-level semantic constraints is proposed. The method first used an encoder based on a gated selection mechanism to encode the given text and the abstractive answer separately. Then, a pre-trained model was used to predict the question intent from the correct answer, and the intent representation was taken as the initialization of the decoder. Finally, an attention mechanism fused the long text and the answer together to generate questions. The proposed model was validated on a publicly available machine reading comprehension dataset. Experimental results demonstrate that the proposed intent-aware complex question generation method outperforms other baseline methods on various evaluation metrics, proving that intent information can effectively help question generation models improve performance.

In summary, this dissertation conducts in-depth research on text generation methods under semantic constraints of different granularities. Targeting the four problems of the generation tasks above, it proposes a keyword-based diverse question generation method, a knowledge-graph-based text generation method, a structured query language generation method for short text, and a complex question generation method for long passages. By conducting
extensive experiments, all the proposed methods are verified on their corresponding datasets and achieve promising performance.
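The decoding input of the first method, which concatenates previously generated questions, the source text, and a control signal, can be sketched as follows. The separator token and field order here are assumptions for illustration; the dissertation does not specify the exact format:

```python
def build_decoder_context(control_signal, source_text, previous_questions,
                          sep=" [SEP] "):
    """Concatenate a question-type control signal, the source text, and the
    generation history into one decoder context string (layout is assumed)."""
    history = sep.join(previous_questions) if previous_questions else ""
    parts = [control_signal, source_text] + ([history] if history else [])
    return sep.join(parts)
```

Feeding the history back in this way is what lets the decoder condition each new question on everything it has already produced.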
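The adjacency-aware attention of the second method might look roughly like the parameter-free sketch below, which masks attention scores so each node only attends to its graph neighbors. It assumes a 0/1 adjacency matrix with self-loops; a trained model would additionally use learned query/key/value projections:

```python
import torch

def adjacency_aware_attention(node_repr, adjacency, num_heads=4):
    """Multi-head self-attention over knowledge-graph nodes, restricted to
    adjacent pairs.

    node_repr: (n_nodes, d_model) node embeddings
    adjacency: (n_nodes, n_nodes) 0/1 matrix (1 = edge; self-loops assumed)
    """
    n, d = node_repr.shape
    d_head = d // num_heads
    # Sketch only: use the node embeddings directly as queries, keys, values.
    q = node_repr.view(n, num_heads, d_head).transpose(0, 1)  # (h, n, d_head)
    k, v = q, q
    scores = q @ k.transpose(1, 2) / d_head ** 0.5            # (h, n, n)
    # Mask non-adjacent pairs so attention follows the graph structure.
    scores = scores.masked_fill(adjacency.unsqueeze(0) == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    out = weights @ v                                          # (h, n, d_head)
    return out.transpose(0, 1).reshape(n, d)
```

With an identity adjacency matrix each node attends only to itself and the input passes through unchanged, which is a convenient sanity check.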
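The third method's final step, converting an abstract syntax tree into an executable query by grammar rules, can be illustrated with a deliberately tiny grammar. The node types `Query` and `Condition` are hypothetical stand-ins; the grammar actually used for MIMICSQL is far richer:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Condition:
    column: str
    op: str
    value: str

@dataclass
class Query:
    select_column: str
    table: str
    where: Optional[Condition] = None

def ast_to_sql(q: Query) -> str:
    """Walk the intermediate AST and emit an executable SQL string."""
    sql = f"SELECT {q.select_column} FROM {q.table}"
    if q.where is not None:
        sql += f" WHERE {q.where.column} {q.where.op} {q.where.value}"
    return sql
```

Because generation targets the tree rather than raw SQL text, every decoded tree is guaranteed to map to a grammatically valid query.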
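The gated selection mechanism of the fourth method can be sketched as an answer-conditioned gate over passage positions. This simplification omits the learned projections a real encoder would have:

```python
import torch

def gated_selection(passage, answer_summary):
    """Gate each passage position by its relevance to the answer.

    passage: (seq_len, d) token representations of the long text
    answer_summary: (d,) pooled representation of the abstractive answer
    """
    scores = passage @ answer_summary           # (seq_len,) relevance scores
    gate = torch.sigmoid(scores).unsqueeze(-1)  # (seq_len, 1), values in (0, 1)
    return gate * passage                       # down-weight irrelevant spans
```

The gate suppresses passage fragments unrelated to the answer before the decoder attends over them, which is the intuition behind selecting answer-relevant content from long text.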