
Research On Deep Learning Algorithms For Text Classification And Question Answering

Posted on: 2022-07-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Dai
GTID: 1488306728465504
Subject: Computer Science and Technology
Abstract/Summary:
Natural language understanding is an important sub-field of natural language processing, and it sits at the intersection of computer science, cognitive science, and artificial intelligence. Its goal is to give computers the human ability to understand the emotions, latent semantics, and other information contained in language, and then to infer human intentions. To this end, researchers have studied a range of application scenarios and tasks, including part-of-speech tagging, named entity recognition, sentiment analysis, topic modeling, information retrieval, and question answering.

In recent years, research on natural language understanding has shown the following characteristics. At the technical level, deep learning theories and techniques have made breakthroughs in various fields and downstream tasks, shifting research on natural language understanding from feature engineering to the design of deep learning models and algorithms. At the task level, most tasks are related to text classification and information retrieval. For example, part-of-speech tagging, named entity recognition, sentiment analysis, and entailment can be transformed into text classification at different granularities, while tasks such as information extraction and question answering are strongly related to information retrieval. At the data level, natural language understanding can be divided into supervised tasks, unsupervised tasks, and comprehensive tasks, according to whether the training data carries supervised information. Among them, unsupervised tasks and comprehensive tasks are particularly challenging.

Based on deep learning algorithms, this thesis takes classification and question answering as its research topics, and takes the supervised task, the unsupervised task, and the comprehensive task as its main research lines. The innovative contributions of this thesis can be summarized as follows:

1. Text classification based on graph neural networks is taken as the starting point for research on the supervised task. A graph neural network based on a fusion mechanism (dubbed the graph fusion network) is proposed to address two challenges that remain in previous graph neural networks. One is the inability to build a good text graph and adapt it using the supervised information; the other is the inability to reason about newly emerging documents conveniently and effectively. For the first challenge, this thesis introduces two kinds of prior knowledge to construct multiple text graphs, and uses supervised information to learn and adjust them, so that they capture more structural information and adapt better to the downstream task. For the second challenge, this thesis derives the document representation from the word-level representations based on the text graphs, which lets the proposed algorithm reason about newly emerging text easily and effectively. Experiments on widely adopted text classification datasets verify the performance of the proposed graph fusion network.

2. Sentiment analysis is taken as the starting point for studying unsupervised tasks, and a multi-source unsupervised sentiment analysis framework is proposed. The framework models the supervised information of multiple source domains, and then uses an unsupervised measurement method to transfer knowledge from those source domains. The proposed algorithm is tested on widely adopted sentiment analysis datasets. The results show that it outperforms other multi-source unsupervised domain adaptation methods, and that it even outperforms supervised methods in some domains.

3. Automatic knowledge transfer from multiple source domains is studied. By redesigning the adversarial training method, two multi-source unsupervised sentiment analysis frameworks based on adversarial training are proposed. The first framework uses the discriminator to automatically learn the distance between source-domain and target-domain samples, and assigns different weights to the source-domain classifiers according to the learned distance. To make better use of the source-domain classifiers, the second framework builds on the first by introducing a self-training method that annotates the unlabeled data of the target domain, which provides more supervision information for the target domain. Extensive experiments show that the adversarial training method designed in this thesis effectively helps the two frameworks transfer knowledge from multiple source domains, and that the target domain achieves excellent performance.

4. Hybrid question answering, which requires both distant supervision information and direct supervision information, is studied, and a solution for it is proposed. The data sources to be processed in hybrid question answering include text and tables, which makes it a relatively complex and comprehensive natural language understanding task. The proposed solution first constructs the distant supervision information needed in the retrieval stage from the final answer, and then completes evidence retrieval through a path-induced evidence retrieval algorithm. Finally, a large pre-trained model is adapted to read the answer according to the semantics of the question. This solution for hybrid question answering was first proposed in this thesis and is verified on the hybrid multi-hop question answering dataset HybridQA. Its retrieval performance surpasses multiple baseline methods, and its reading performance is also excellent.
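
The distance-based weighting of source-domain classifiers and the self-training step described in contribution 3 can be illustrated with a small sketch. The abstract does not give the weighting function, the distance values, or the rule for accepting pseudo-labels, so everything below is an assumption made for illustration: a softmax over negative distances, a fixed confidence threshold, and the hypothetical names `distance_weights`, `combine_predictions`, and `pseudo_label`. It is not the thesis's implementation.

```python
import math
from typing import List, Optional


def distance_weights(distances: List[float], temperature: float = 1.0) -> List[float]:
    """Turn per-source distances into weights that sum to 1.

    Assumption: a softmax over negative distances, so that source domains
    judged closer to the target sample receive larger weights.
    """
    scaled = [-d / temperature for d in distances]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def combine_predictions(source_probs: List[List[float]], weights: List[float]) -> List[float]:
    """Weighted average of the class probabilities from each source classifier."""
    num_classes = len(source_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, source_probs))
        for c in range(num_classes)
    ]


def pseudo_label(combined: List[float], threshold: float = 0.8) -> Optional[int]:
    """Self-training step: keep the predicted class only if it is confident enough.

    Assumption: a fixed confidence threshold decides which unlabeled
    target-domain samples receive a pseudo-label.
    """
    best = max(range(len(combined)), key=lambda c: combined[c])
    return best if combined[best] >= threshold else None


# Hypothetical distances from one target sample to three source domains,
# as a discriminator might report them (smaller means more similar).
distances = [0.2, 1.0, 2.0]
# Hypothetical [negative, positive] probabilities from the three source classifiers.
source_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]

weights = distance_weights(distances)
combined = combine_predictions(source_probs, weights)
label = pseudo_label(combined, threshold=0.7)
```

In this sketch, a target sample whose combined prediction falls below the threshold is left unlabeled and is not added to the target-domain training data.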
Keywords/Search Tags: Natural Language Understanding, Neural Network, Deep Learning, Text Classification, Question Answering