
Research On Machine Reading Comprehension Based On Pre-Trained Language Model

Posted on: 2022-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Bao
Full Text: PDF
GTID: 2518306779975779
Subject: Library Science and Digital Library
Abstract/Summary:
Machine reading comprehension is an active research direction in natural language processing that aims to predict answers to questions based on given passages. Its research content is close to practical application, and prediction accuracy directly reflects model performance. By learning general language representations from large amounts of existing data, a pre-trained language model provides a better initialization, accelerates convergence on the target task, and helps avoid over-fitting on small datasets. This thesis conducts in-depth research on machine reading comprehension tasks; the specific work is as follows:

(1) To address the inability of existing extractive machine reading comprehension models to deeply capture global semantic information and the deep interaction between passages and questions, the thesis constructs a pre-training/fine-tuning paradigm with a multi-layer perceptron output layer based on MLM as Correction BERT (MacBERT). This approach largely resolves the polysemy problem that traditional static word vectors cannot handle. Meanwhile, the multi-layer Transformer structure better captures semantic relationships in long texts, which is a clear advantage for machine reading tasks such as long-text comprehension. By replacing traditional mask tokens with similar words, MacBERT better matches the nature of machine reading comprehension and resolves the pre-training/fine-tuning inconsistency of traditional paradigms; answers are then predicted from the interaction information through a multi-layer perceptron, which improves performance over traditional models to some extent. On the span-extraction questions of the CJRC dataset, the model achieves an F1 of 82.8% on civil cases and 79.8% on criminal cases.

(2) To adapt to the characteristics of different question types, different pre-training/fine-tuning paradigms are proposed. First, for span-extraction questions, a bidirectional long short-term memory (BiLSTM) network is used to capture global information, so that the model learns passage-to-question and question-to-passage interaction features and builds representations over relatively long distances; this suits span-extraction questions with long answer spans. Second, for YES/NO questions, a self-attention mechanism is used to fully mine fine-grained word-level cues between the texts. Third, for unanswerable questions, the special [CLS] token of the pre-trained model, which encodes sentence-level information, is used. On the CJRC dataset, the resulting model achieves an F1 of 84.7% on civil cases and 82.3% on criminal cases. Compared with the traditional Bidirectional Attention Flow (BiDAF) model, F1 improves by 23.6 and 19.6 percentage points on civil and criminal cases respectively; compared with the MacBERT model without the network layers designed in this work, F1 increases by 3.5 percentage points on civil cases and 4.9 percentage points on criminal cases, verifying the effectiveness of the proposed multi-type machine reading comprehension method based on a pre-trained model.
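
The following is a minimal PyTorch sketch, not the thesis's actual code, of how a multi-type answer head of the kind described above could sit on top of a pre-trained encoder such as MacBERT: a BiLSTM plus MLP for span extraction, a self-attention pooling for YES/NO classification, and a [CLS]-based answerability classifier. All class and parameter names (MultiTypeMRCHead, hidden_size, etc.) are illustrative assumptions.

import torch
import torch.nn as nn

class MultiTypeMRCHead(nn.Module):
    """Illustrative multi-type MRC head over encoder hidden states (batch, seq_len, hidden)."""
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        # BiLSTM over encoder outputs to capture long-range, global context
        # for span-extraction questions.
        self.bilstm = nn.LSTM(hidden_size, hidden_size // 2,
                              batch_first=True, bidirectional=True)
        # MLP producing start/end position logits for span extraction.
        self.span_head = nn.Linear(hidden_size, 2)
        # Self-attention over tokens to mine fine-grained cues for YES/NO questions.
        self.self_attn = nn.MultiheadAttention(hidden_size, num_heads=8,
                                               batch_first=True)
        self.yes_no_head = nn.Linear(hidden_size, 2)
        # Answerability classifier on the [CLS] (first-token) representation.
        self.no_answer_head = nn.Linear(hidden_size, 2)

    def forward(self, encoder_hidden: torch.Tensor):
        # encoder_hidden: output of a pre-trained encoder such as MacBERT.
        lstm_out, _ = self.bilstm(encoder_hidden)
        start_logits, end_logits = self.span_head(lstm_out).split(1, dim=-1)
        attn_out, _ = self.self_attn(encoder_hidden, encoder_hidden, encoder_hidden)
        yes_no_logits = self.yes_no_head(attn_out.mean(dim=1))
        no_answer_logits = self.no_answer_head(encoder_hidden[:, 0])
        return (start_logits.squeeze(-1), end_logits.squeeze(-1),
                yes_no_logits, no_answer_logits)

A full system would feed the encoder's token representations into this head and train each branch on the corresponding CJRC question type (span, YES/NO, unanswerable).
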
Keywords/Search Tags: Extractive machine reading comprehension, Pre-training models, Deep learning, Multi-type problems