Question generation refers to the automatic generation of natural language questions from a given text and answer. The task has broad application prospects. In human-computer interaction, chatbots (e.g., Siri and Microsoft Xiaoice) can converse with users or solicit feedback by asking questions, and well-posed questions provide a better user experience. In education, generating targeted questions from course materials can assess students' proficiency, reveal how well they have mastered the knowledge, encourage self-examination, and reduce the teaching burden. In addition, as the dual task of automatic question answering, question generation can improve question answering models by producing large numbers of high-quality questions that serve as large-scale training data.

At present, methods based on pre-trained language models have achieved excellent results on question generation. However, they cannot effectively exploit the syntactic dependency information of the source text: dependency relations are well-defined symbols that lie in a semantic space different from that of the words the pre-trained model was trained on, so encoding them directly with the pre-trained language model introduces a semantic gap and hinders further improvement of question generation quality. In addition, little existing work addresses complex question generation, and the available methods usually build a single semantic graph from the whole passage, ignoring the multi-hop information contained within individual sentences. A sentence often contains multiple facts, and a passage-level semantic graph cannot guarantee the factual correctness of the generated question.

To address these problems, this thesis proposes two end-to-end question generation models and conducts extensive experiments on large-scale datasets to verify their effectiveness, as follows:

1. To address the problem that existing question generation methods based on pre-trained language models cannot effectively exploit the syntactic dependency information of the source text, this thesis proposes a sentence-level question generation model based on syntax-aware prompts. The model uses a relation-aware attention graph encoder to encode the syntactic dependencies of the given sentence. The resulting syntactic dependency vectors are then fused into the pre-trained model through continuous prompt learning and represented jointly with the source sentence (a minimal sketch of this fusion step appears after this summary). In addition, a syntax-aware coverage mechanism is designed for the decoding process to alleviate word repetition in the generated questions, which effectively improves question quality.

2. To address the challenge that current complex question generation methods ignore the multi-hop information contained in a single sentence, this thesis proposes a passage-level complex question generation model based on fine-grained semantic graph enhancement. The model constructs a semantic graph for each sentence of the given passage separately, and then uses a multi-head attention-based graph encoder to obtain local vector representations of the fact triples and a global vector representation of each semantic graph. At each decoding step, an attention mechanism selects the semantic graph, and the fact triples within it, that should be attended to (a second sketch after this summary illustrates this step). By incorporating this information, the model enhances the factual correctness and complexity of the generated questions and guides the generation of the current word.

3. Extensive experiments are conducted to verify the effectiveness of the proposed question generation models. Comparative experiments are designed on two large-scale reading comprehension datasets, and the models are compared with existing state-of-the-art methods. The experimental results demonstrate that the proposed methods generate questions that are more accurately answerable by the given answers and improve the factual correctness of the questions.
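The first sketch below illustrates how syntactic dependency information could be fused into a pre-trained encoder as continuous prompts, as described in contribution 1. It is a minimal PyTorch sketch under stated assumptions: the class names, tensor layouts, and dimensions are illustrative and do not reproduce the thesis's exact implementation.

```python
# Minimal sketch: relation-aware graph attention over dependency arcs,
# followed by fusion of the syntactic vectors into a pre-trained model's
# input embeddings as a continuous prompt. All sizes and names are
# illustrative assumptions, not the thesis's exact configuration.
import torch
import torch.nn as nn


class RelationAwareGraphEncoder(nn.Module):
    """Attention over dependency arcs with learned relation embeddings."""

    def __init__(self, hidden: int, num_relations: int):
        super().__init__()
        self.rel_emb = nn.Embedding(num_relations, hidden)
        self.q = nn.Linear(hidden, hidden)
        self.k = nn.Linear(hidden, hidden)
        self.v = nn.Linear(hidden, hidden)

    def forward(self, node_feats, rel_ids, adj_mask):
        # node_feats: (B, N, H)  token-level features
        # rel_ids:    (B, N, N)  dependency-relation id of each arc
        # adj_mask:   (B, N, N)  1 where an arc exists; every token is
        #                        assumed to carry a self-loop
        rel = self.rel_emb(rel_ids)                        # (B, N, N, H)
        q = self.q(node_feats).unsqueeze(2)                # (B, N, 1, H)
        k = self.k(node_feats).unsqueeze(1) + rel          # (B, N, N, H)
        scores = (q * k).sum(-1) / node_feats.size(-1) ** 0.5
        scores = scores.masked_fill(adj_mask == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)               # (B, N, N)
        return attn @ self.v(node_feats)                   # (B, N, H)


class SyntaxPromptFusion(nn.Module):
    """Projects syntactic vectors into the PLM embedding space and
    prepends them to the source token embeddings as a continuous prompt."""

    def __init__(self, graph_hidden: int, plm_hidden: int):
        super().__init__()
        self.proj = nn.Linear(graph_hidden, plm_hidden)

    def forward(self, token_embeds, syntax_vecs):
        prompt = self.proj(syntax_vecs)                    # (B, N, H_plm)
        return torch.cat([prompt, token_embeds], dim=1)    # (B, N+L, H_plm)
```

In practice, the fused sequence could be passed to a pre-trained seq2seq encoder (for example via the `inputs_embeds` argument of a Hugging Face BART or T5 model) together with a correspondingly extended attention mask; that wiring is omitted here to keep the sketch self-contained.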
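The second sketch illustrates the hierarchical attention over sentence-level semantic graphs and their fact triples at each decoding step, as described in contribution 2. It is likewise a minimal PyTorch sketch; the soft (rather than hard) graph selection and the tensor shapes are assumptions made for illustration.

```python
# Minimal sketch: at each decoding step, score the sentence-level semantic
# graphs against the decoder state, score the fact triples inside each
# graph, and combine them into a single context vector.
import torch
import torch.nn as nn


class FactGraphAttention(nn.Module):
    def __init__(self, hidden: int):
        super().__init__()
        self.graph_attn = nn.Linear(hidden * 2, 1)
        self.triple_attn = nn.Linear(hidden * 2, 1)

    def forward(self, dec_state, graph_vecs, triple_vecs, triple_mask):
        # dec_state:   (B, H)        decoder hidden state at this step
        # graph_vecs:  (B, G, H)     global vector of each sentence graph
        # triple_vecs: (B, G, T, H)  local vectors of the triples per graph
        # triple_mask: (B, G, T)     1 for real triples, 0 for padding;
        #                            every graph is assumed to hold at
        #                            least one real triple
        B, G, T, H = triple_vecs.shape

        # 1) Score each sentence-level semantic graph against the decoder.
        s = dec_state.unsqueeze(1).expand(-1, G, -1)           # (B, G, H)
        g_scores = self.graph_attn(torch.cat([s, graph_vecs], -1)).squeeze(-1)
        g_weights = torch.softmax(g_scores, dim=-1)            # (B, G)

        # 2) Score the fact triples inside every graph.
        st = dec_state.view(B, 1, 1, H).expand(-1, G, T, -1)   # (B, G, T, H)
        t_scores = self.triple_attn(torch.cat([st, triple_vecs], -1)).squeeze(-1)
        t_scores = t_scores.masked_fill(triple_mask == 0, float("-inf"))
        t_weights = torch.softmax(t_scores, dim=-1)            # (B, G, T)

        # 3) Combine: triples weighted by the relevance of their graph.
        triple_ctx = (t_weights.unsqueeze(-1) * triple_vecs).sum(2)  # (B, G, H)
        context = (g_weights.unsqueeze(-1) * triple_ctx).sum(1)      # (B, H)
        return context, g_weights, t_weights
```

The returned context vector would then be combined with the decoder state to predict the next word, while the two sets of attention weights indicate which semantic graph and which fact triples informed the prediction.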