Question Generation (QG) is a challenging Natural Language Processing (NLP) task that aims to generate questions from a given answer and context, such that the generated question can be answered by the given answer within that context. Recently, with the development of deep learning, it has become possible to automatically generate high-quality questions with neural networks, and QG has therefore attracted increasing attention from the NLP community. In this thesis, we focus on improving the performance of QG models by effectively utilizing linguistic features of the text, including Named Entity Recognition (NER), Part-of-Speech (POS) tags, and so on.

Most previous works are based on the sequence-to-sequence (seq2seq) framework, adopting attention and copy mechanisms. As with traditional word embeddings, these works normally embed linguistic features with a set of trainable parameters, which leaves the linguistic features under-exploited. To address this issue, we propose to utilize linguistic information via large pre-trained neural models. Specifically, these pre-trained models are first trained on several specific NLP tasks so as to better represent linguistic features; the resulting feature representations are then fused into a seq2seq-based QG model to guide question generation. In addition, we introduce a novel linguistic feature customized for QG, the Question Answering Feature (QAF). Since QA and QG are dual tasks, this feature captures the relationship among the answer, context, and question, helping to generate questions with higher answerability.

To demonstrate the effectiveness of our approaches, we conduct extensive experiments on two benchmark QG datasets: SQuAD and MS-MARCO. The experimental results show that our approach outperforms state-of-the-art QG systems, improving over the baseline by 17.2% and 6.2% under the BLEU-4 metric on the two datasets, respectively. Furthermore, we conduct extensive case studies to analyze the influence of deep linguistic features on question generation. Finally, we propose DDS (Difficulty-based Data Splitting), a universal strategy for exploring a model's performance boundaries, which can estimate a model's best and worst performance on a dataset. By evaluating these performance boundaries, researchers can understand their models more comprehensively.
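The conventional approach criticized above, embedding linguistic features with trainable parameters, typically looks up a small embedding for each POS/NER tag and concatenates it with the word embedding. The following NumPy sketch illustrates this; the vocabularies, dimensions, and random initialization are illustrative assumptions, not the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative vocabularies (assumptions, not the thesis's actual tag sets).
word_vocab = {"who": 0, "wrote": 1, "hamlet": 2}
pos_vocab = {"WP": 0, "VBD": 1, "NNP": 2}
ner_vocab = {"O": 0, "PERSON": 1, "WORK_OF_ART": 2}

WORD_DIM, FEAT_DIM = 8, 2

# Trainable parameter tables: one row per vocabulary entry.
word_emb = rng.normal(size=(len(word_vocab), WORD_DIM))
pos_emb = rng.normal(size=(len(pos_vocab), FEAT_DIM))
ner_emb = rng.normal(size=(len(ner_vocab), FEAT_DIM))

def encode(words, pos_tags, ner_tags):
    """Concatenate word, POS, and NER embeddings for each token."""
    rows = []
    for w, p, n in zip(words, pos_tags, ner_tags):
        rows.append(np.concatenate([
            word_emb[word_vocab[w]],
            pos_emb[pos_vocab[p]],
            ner_emb[ner_vocab[n]],
        ]))
    return np.stack(rows)  # shape: (seq_len, WORD_DIM + 2 * FEAT_DIM)

X = encode(["who", "wrote", "hamlet"],
           ["WP", "VBD", "NNP"],
           ["O", "O", "WORK_OF_ART"])
print(X.shape)  # (3, 12)
```

Because the tag embeddings here are trained only on the end task, they carry no external linguistic knowledge, which is the limitation the pre-trained feature representations in this thesis are meant to overcome.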
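The core idea behind DDS can be sketched as follows: given a per-example difficulty score, evaluating the model on the easiest and hardest slices of the data approximates its best-case and worst-case performance. The difficulty function, metric, and split fraction below are hypothetical placeholders, not the thesis's actual choices.

```python
def performance_bounds(examples, difficulty, evaluate, frac=0.3):
    """Estimate a model's best/worst performance via difficulty-based splits.

    examples   -- list of evaluation examples
    difficulty -- function mapping an example to a difficulty score
    evaluate   -- function mapping a list of examples to a metric (higher = better)
    frac       -- fraction of the data kept in each extreme split (assumed value)
    """
    ranked = sorted(examples, key=difficulty)
    k = max(1, int(len(ranked) * frac))
    easiest, hardest = ranked[:k], ranked[-k:]
    # Best-case estimate from the easiest slice, worst-case from the hardest.
    return evaluate(easiest), evaluate(hardest)

# Toy usage: difficulty = context length, metric = mean of precomputed scores.
data = [{"context_len": n, "score": 1.0 / n} for n in range(1, 11)]
best, worst = performance_bounds(
    data,
    difficulty=lambda ex: ex["context_len"],
    evaluate=lambda exs: sum(ex["score"] for ex in exs) / len(exs),
)
print(best > worst)  # True
```

The gap between the two returned values indicates how sensitive the model is to example difficulty, which is the kind of performance-boundary analysis the abstract describes.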