Answer selection is one of the key subtasks of question answering, and it is also an important but challenging task in natural language processing. Existing answer selection models are mainly deep models based on convolutional neural networks (CNN) or recurrent neural networks (RNN). However, both CNN and RNN have their limitations in natural language processing. Self-attention, a recently proposed network architecture, has demonstrated promising performance in a wide range of tasks, but it has not yet been applied to answer selection. In this thesis, we apply self-attention to answer selection. To the best of our knowledge, this is the first work to apply self-attention to answer selection. Furthermore, we design new self-attention models according to the characteristics of the answer selection task. The three main contributions of this thesis can be outlined as follows:

First, existing self-attention models cannot distinguish local information from global information well. We propose a novel self-attention method, called gated group self-attention (GGSA), to tackle this problem. More specifically, GGSA explicitly separates local information from global information through group self-attention, and combines the two kinds of information through a gate mechanism. Based on GGSA, we also propose a novel question-answer interaction mechanism with a residual structure, which enables the words in an answer to take question information into account. Experimental results on two popular answer selection datasets show that GGSA outperforms existing models and achieves state-of-the-art performance.

Second, existing answer selection models lack background information and knowledge beyond the given context. We propose a BERT-based answer selection (BERT-AS) model to tackle this problem. Through pre-training on a large language corpus, abundant common knowledge and linguistic phenomena are encoded into the parameters of BERT, which can effectively enhance the performance of answer selection models. Experimental results on three popular answer selection datasets show that BERT-AS outperforms existing models and achieves state-of-the-art performance.

Third, existing interaction-based answer selection models suffer from either slow inference or high memory cost when they are deployed for online prediction. We propose a hashing-based answer selection (HAS) framework to tackle this problem. When the self-attention model BERT is adopted as the encoder in HAS, we obtain an answer selection model based on both hashing and self-attention. HAS learns a binary matrix representation for each answer through a hashing strategy, which dramatically reduces the memory cost of storing these representations. By storing the (binary) matrix representations of answers in memory, HAS avoids recomputing them during online prediction. Consequently, complex self-attention encoders like BERT and GPT-2 can be adopted in HAS while online prediction remains fast with a low memory cost. Experimental results on two popular answer selection datasets show that HAS outperforms existing models and achieves state-of-the-art performance.
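To make the first contribution concrete, the following Python (PyTorch) sketch illustrates one way gated group self-attention could combine a local (within-group) attention branch with a global attention branch through a learned gate. The single-head formulation, the group_size parameter, and the sigmoid gate over the concatenated branches are illustrative assumptions, not the exact architecture from the thesis.

    # A minimal sketch of gated group self-attention (GGSA): a local branch
    # attends within non-overlapping groups, a global branch attends over the
    # whole sequence, and a gate mixes the two per position.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GatedGroupSelfAttention(nn.Module):
        def __init__(self, dim, group_size):
            super().__init__()
            self.group_size = group_size
            self.qkv = nn.Linear(dim, 3 * dim)
            self.gate = nn.Linear(2 * dim, dim)

        def attend(self, q, k, v):
            # Standard scaled dot-product attention.
            scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
            return F.softmax(scores, dim=-1) @ v

        def forward(self, x):
            # x: (batch, seq_len, dim); seq_len assumed divisible by group_size.
            b, n, d = x.shape
            q, k, v = self.qkv(x).chunk(3, dim=-1)

            # Global branch: attention over the whole sequence.
            global_out = self.attend(q, k, v)

            # Local branch: attention restricted to non-overlapping groups.
            g = self.group_size
            qg = q.view(b, n // g, g, d)
            kg = k.view(b, n // g, g, d)
            vg = v.view(b, n // g, g, d)
            local_out = self.attend(qg, kg, vg).view(b, n, d)

            # Gate mechanism: mix local and global information per position.
            gate = torch.sigmoid(self.gate(torch.cat([local_out, global_out], dim=-1)))
            return gate * local_out + (1.0 - gate) * global_out

    # Usage: mixes 4-token local groups with full-sequence attention.
    layer = GatedGroupSelfAttention(dim=64, group_size=4)
    out = layer(torch.randn(2, 12, 64))  # -> (2, 12, 64)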
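For the second contribution, one plausible minimal BERT-AS setup is sketched below: the question and a candidate answer are packed into a single sequence, encoded by BERT, and the [CLS] representation is projected to a relevance score. The pooling choice, the scoring head, and the Hugging Face model name are assumptions for illustration; the thesis's exact architecture may differ.

    # A minimal sketch of a BERT-based answer selection scorer (assumed
    # architecture: [CLS] pooling plus a linear relevance head).
    import torch
    import torch.nn as nn
    from transformers import BertModel, BertTokenizer

    class BertAnswerSelector(nn.Module):
        def __init__(self, model_name="bert-base-uncased"):
            super().__init__()
            self.bert = BertModel.from_pretrained(model_name)
            self.score = nn.Linear(self.bert.config.hidden_size, 1)

        def forward(self, input_ids, attention_mask, token_type_ids):
            out = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
            # The [CLS] token summarizes the question-answer pair.
            cls = out.last_hidden_state[:, 0]
            return self.score(cls).squeeze(-1)

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    batch = tokenizer("who wrote hamlet?",
                      "Hamlet was written by Shakespeare.",
                      return_tensors="pt")
    model = BertAnswerSelector()
    relevance = model(**batch)  # higher score = more relevant answer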
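For the third contribution, the sketch below shows the core hashing idea in HAS under one common formulation: binarize the real-valued answer representation with a sign function, and use a straight-through estimator during training so the binarization step remains trainable. The sign/straight-through choice is an assumption; the hashing strategy actually used in HAS may differ.

    # A minimal sketch of the hashing step in HAS, assuming sign-based
    # binarization with a straight-through estimator.
    import torch

    def hash_answer(real_repr: torch.Tensor) -> torch.Tensor:
        # real_repr: (answer_len, dim) real-valued matrix from the encoder.
        binary = torch.sign(real_repr)  # hard {-1, 0, +1} codes
        # Straight-through: the forward pass uses the binary codes, while the
        # backward pass copies gradients to real_repr as if binarization
        # were the identity function.
        return real_repr + (binary - real_repr).detach()

    # At serving time only the 1-bit codes are stored: packing each entry of
    # a float32 matrix into one bit cuts memory roughly 32x, and the cached
    # codes avoid re-encoding every answer for each online query.
    codes = hash_answer(torch.randn(40, 768))
    packed = (codes > 0).to(torch.uint8)  # 0/1 codes ready for bit-packing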