| Reading comprehension is the task of reading a given document and answering a series of questions about it. It is a research focus in natural language processing and a benchmark for evaluating artificial intelligence. Set against the backdrop of the Beijing College Entrance Examination, this paper studies the answering of description questions for reading comprehension of Chinese literary documents. The main work is as follows: (1) Analysis of description questions. This paper analyzes the classification, difficulties and techniques of description questions for reading comprehension of Chinese literary documents, and identifies answer sentence selection and sentence fusion as the research priorities. (2) Answer sentence selection. First, this paper uses three strategies to evaluate the semantic relevance between sentences: sentence similarity based on HowNet, sentence similarity based on word embeddings, and topic distribution similarity based on LDA. Experiments on reading comprehension datasets show that the word-embedding-based method performs best, with an F-measure of 49.08%. In addition, answer sentence selection can be regarded as a binary classification problem, so this paper uses a convolutional neural network to classify each sentence as an answer sentence or not. After training and testing on annotated datasets, the F-measure reaches 68.35%. (3) Sentence fusion. This paper studies sentence fusion techniques for description questions and presents a method that takes into account sentence importance, relevance to the query and sentence readability. Its main idea is as follows. First, the parts to be fused are chosen based on sentence separation and word salience. Then, repeated content is merged through word alignment. Finally, sentences are generated by integer linear optimization, which uses dependency relations, a language model and word salience. Experiments on reading comprehension datasets from college entrance examinations achieve an F-measure of 82.62%. (4) The pipeline of answer sentence selection followed by sentence fusion is applied to real GaoKao datasets; after human rating, the final accuracy of this approach is 30.84%. The contributions of this paper are as follows: (1) it uses a variety of methods to realize answer sentence selection; (2) it presents a sentence fusion method that combines dependency relations with a language model based on word importance; (3) it uses a framework that combines answer sentence selection with sentence fusion to answer description questions. This exploration lays a foundation for future study. |
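
The word-embedding strategy for answer sentence selection and the F-measure used to score it can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state how sentence vectors are built, which embeddings are used, or how candidates are thresholded. The sketch assumes averaged word vectors, cosine similarity and a fixed cutoff. The toy vectors, the English tokens, the `threshold` value and all function names are hypothetical.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load pretrained embeddings for Chinese words.
TOY_EMBEDDINGS = {
    "moon": [0.9, 0.1, 0.0],
    "night": [0.8, 0.2, 0.1],
    "quiet": [0.7, 0.3, 0.2],
    "river": [0.1, 0.9, 0.0],
    "boat": [0.2, 0.8, 0.1],
    "market": [0.0, 0.1, 0.9],
}


def sentence_vector(tokens, embeddings):
    """Average the vectors of known tokens; return None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_answer_sentences(question, sentences, embeddings, threshold=0.9):
    """Return indices of sentences whose similarity to the question
    is at least `threshold` (an assumed cutoff, not from the paper)."""
    q_vec = sentence_vector(question, embeddings)
    if q_vec is None:
        return []
    selected = []
    for idx, tokens in enumerate(sentences):
        s_vec = sentence_vector(tokens, embeddings)
        if s_vec is not None and cosine(q_vec, s_vec) >= threshold:
            selected.append(idx)
    return selected


def f_measure(predicted, gold):
    """Balanced F-measure (F1) over sets of selected sentence indices."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

Given a tokenized question and tokenized candidate sentences, `select_answer_sentences` returns the indices of the candidates it keeps, and `f_measure` compares those indices with the annotated answer sentences. The CNN classifier described in the abstract would replace the similarity cutoff with a learned binary decision, while the evaluation step would stay the same.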