| With the development of democracy, the rule of law, and ethnic policy in China, ethnic minorities are submitting more and more inquiries to discipline inspection and supervision institutions. Most of these inquiries are repetitive, which increases the workload of discipline inspection and supervision departments. Mongolian is one of the important minority languages in China and has a large number of speakers, so a Mongolian automatic question answering system can provide the public with more convenient and efficient consulting services. Such a system therefore has important research and application value. However, research on Mongolian automatic question answering is still in its infancy, and related work on Mongolian remains very limited, which restricts the development of Mongolian informatization. This thesis therefore targets the field of discipline inspection and supervision and studies two modules of a Mongolian automatic question answering system: question intention recognition and question-answer matching. The main work is as follows. Firstly, a corpus for the discipline inspection and supervision domain is constructed. At present, there is no open Mongolian question answering dataset in this field, so text corpora and question-answer pairs were collected before the experiments. To build the corpus, this thesis uses TextRank combined with Word2Vec to build a domain vocabulary. It then uses the Scrapy framework to search relevant websites for the domain words and crawl question answering data, from which the discipline inspection and supervision domain corpus is constructed. Secondly, for question intention recognition, this thesis proposes Mon-BERT, a Mongolian pre-trained model designed according to the characteristics of Mongolian word formation. At the same time, the 
user’s intention is analyzed from the perspective of question category in the question answering system. This thesis proposes a question intention recognition model that combines the Mon-BERT Mongolian pre-trained model with a bidirectional long short-term memory (BiLSTM) recurrent neural network. Experimental results show that the Mon-BERTBiLSTM model improves the F1 score by 5.54% over the baseline model on the question intention recognition task. Finally, for the question-answer matching module, this thesis proposes the Mon-BERTQA question-answer matching model. Model training is divided into a pre-training stage and a fine-tuning stage. In the pre-training stage, pairs of adjacent (preceding and following) sentences are input, and their order is exchanged during training to exploit the symmetry of the pair and improve model performance. In the fine-tuning stage, question-answer data from the discipline inspection and supervision field are used to fine-tune the model to determine whether a question and an answer match. Experiments show that the Mon-BERTQA model improves the F1 score by 4.95% over the baseline model. |
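The order exchange described for the pre-training stage can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that the exchange is realized by adding a swapped copy of each sentence pair to the training data, and that the pair's label is unchanged by the swap (the symmetry assumption). The function name `add_swapped_pairs` and the tuple layout are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (first_sentence, second_sentence, label).
# Assumption: the label describes a symmetric relation between the two
# sentences, so it stays the same when their order is exchanged.
SentencePair = Tuple[str, str, int]


def add_swapped_pairs(pairs: List[SentencePair]) -> List[SentencePair]:
    """Return each original pair followed by a copy with the order exchanged."""
    augmented: List[SentencePair] = []
    for first, second, label in pairs:
        augmented.append((first, second, label))
        # Skip the swap when both sentences are identical to avoid duplicates.
        if first != second:
            augmented.append((second, first, label))
    return augmented


if __name__ == "__main__":
    demo = [("sentence A", "sentence B", 1), ("sentence C", "sentence D", 0)]
    for row in add_swapped_pairs(demo):
        print(row)
```

A training loop could equally exchange the order at random per batch rather than materializing both orders up front. The abstract does not say which variant is used.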