Design and Implementation of a Retrieval-Based Open-Domain Question Answering System Based on Tables

Posted on: 2024-06-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Zhang
GTID: 2568306914982529
Subject: Electronic Information (Computer Technology) (Professional Degree)
Abstract/Summary:
Open-domain question answering is an active research direction in natural language processing, with applications such as voice assistants and intelligent search. Because questions are not limited to a particular domain, retrieving the document most likely to contain the answer from a massive knowledge base is crucial to the accuracy of answer prediction. Tabular data is among the most abundant, easily obtained, and easily processed forms of structured data on the Internet. Its clear structure and high timeliness make it well suited as the storage form of the knowledge base in an open-domain question answering system. Open-domain question answering over tabular data is therefore of great practical significance.

In this thesis, we construct an open-domain question answering system over tabular data using a three-stage Retriever, Re-ranker, Reader framework. The main work is as follows:

(1) For the Retriever, we encode the question and the table with a pre-trained language model and build a dense retrieval model based on a two-tower architecture. We also optimize the sampling process with a hard negative sampling method based on sampling limitation, which restricts the feature distribution of the samples in the same mini-batch so that they act as hard negatives for one another. Experiments demonstrate the effectiveness of the proposed sampling method.

(2) For the Re-ranker, we construct a cross-encoder model over the question and the table and compute the matching degree from its output features. On top of this baseline model, we add a column-level relevance matching task to form a multi-task matching model, which further improves the performance of the Re-ranker.

(3) We construct a Text-to-SQL model, which converts the question into a structured query language (SQL) statement for the corresponding table and extracts the answer.

We integrate the modules above to design and implement a complete open-domain question answering system over tabular data. Each module is tested, and the results show that the system functions well and meets the design requirements.
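As a rough illustration of the Retriever stage, the sketch below shows how a two-tower model is commonly trained with in-batch negatives: the table paired with each question is the positive, and the other tables in the same mini-batch serve as negatives. This is a generic version under stated assumptions, not the thesis's method. The "sampling limitation" step that makes batch members hard negatives for one another is not specified in the abstract and is not reproduced here. The function names (in_batch_negative_loss, rank_tables) and the toy 2-dimensional embeddings are hypothetical; in a real system the vectors would come from the pre-trained encoders.

```python
import math

def dot(u, v):
    """Inner product used as the question-table similarity score."""
    return sum(a * b for a, b in zip(u, v))

def in_batch_negative_loss(question_vecs, table_vecs):
    """Mean cross-entropy where table i is the positive for question i
    and every other table in the mini-batch acts as a negative."""
    if len(question_vecs) != len(table_vecs):
        raise ValueError("need exactly one positive table per question")
    total = 0.0
    for i, q in enumerate(question_vecs):
        scores = [dot(q, t) for t in table_vecs]
        # Numerically stable log-sum-exp over all tables in the batch.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[i]
    return total / len(question_vecs)

def rank_tables(question_vec, table_vecs):
    """Return table indices sorted by descending similarity to the question."""
    scores = [dot(question_vec, t) for t in table_vecs]
    return sorted(range(len(table_vecs)), key=lambda j: scores[j], reverse=True)

# Toy embeddings standing in for encoder outputs (assumed values).
questions = [[1.0, 0.0], [0.0, 1.0]]
tables = [[2.0, 0.0], [0.0, 2.0]]

loss = in_batch_negative_loss(questions, tables)
ranking = rank_tables(questions[0], tables)
```

With this loss, negatives whose embeddings lie close to the positive contribute the most, which is why restricting a mini-batch to similar samples, as the thesis describes, would be expected to yield harder negatives.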
Keywords/Search Tags:natural language processing, open domain question answering, retrieval model, negative sampling, table question answering