| Machine reading comprehension (MRC) is an important and challenging task in natural language processing that requires machines to read and understand the semantic content of natural language contexts in order to answer questions. With the rapid development of deep learning techniques and the release of various MRC datasets, MRC has become one of the most active research directions among domestic and international scholars. For example, a judicial question answering system built on MRC technology can provide accurate and timely legal consultation services to users. Traditional MRC models mostly rely on information from a single paragraph to answer a question, but in practice a model often needs to aggregate information from multiple paragraphs and perform multi-hop reasoning to obtain an accurate answer. To address this problem, the multi-hop MRC task has been proposed. Multi-hop MRC is a reading comprehension task that requires a machine to reason over multiple hops across multiple paragraphs to obtain the answer. Although existing multi-hop MRC models have achieved some results, they still have several drawbacks. First, many existing multi-hop MRC paragraph selection models ignore the multi-hop dependencies between paragraphs and do not take into account the strength of the information association between the question and each paragraph, so distracting information is passed on to downstream reading comprehension. Second, existing multi-hop MRC models based on graph neural networks do not use sufficiently rich types of edges when constructing the graph, and therefore fail to fully integrate the interaction information between graph nodes. Finally, message passing is performed synchronously when nodes are updated through graph neural networks, which does not take into account that the semantic relationships of 
nodes at different levels of granularity have different priorities, leaving the inference capability and interpretability of the model insufficient. To address the above issues, this paper carries out the following work. (1) A paragraph retrieval model based on pairwise learning to rank is proposed. First, the RoBERTa pre-trained model is used as the encoding module for the question and paragraphs; its multi-head self-attention mechanism enhances multi-hop reasoning between paragraphs and captures the semantic information between the question and each paragraph. Second, a paragraph ranking mechanism is constructed that judges whether a paragraph is a supporting paragraph and scores paragraphs that contain the answer span; the score of each paragraph is compared with those of all other paragraphs to obtain the pair relation labels between all paragraphs. Finally, the similarity score of each paragraph pair is calculated to predict the relevance of the question to each paragraph. Experimental results on the Distractor setting of the open-source HotpotQA dataset show that the proposed paragraph retrieval model significantly improves the retrieval of supporting paragraphs compared with existing paragraph retrieval models. (2) A multi-granularity stepwise inference graph attention network is proposed. The retrieval model based on pairwise learning to rank selects the set of supporting paragraphs on which subsequent multi-hop inference is performed. First, a multi-granularity dynamic graph (consisting of selected entity nodes, sentence nodes and seven types of connecting edges) is constructed, and semantic relation groups at three levels of granularity are defined: entity-entity, entity-sentence and sentence-sentence. Second, RoBERTa and bi-directional attention are used to jointly encode the question and the set of supporting paragraphs. A step-by-step message passing approach based on semantic relation groups at different levels of 
granularity (entity-entity → entity-sentence → sentence-sentence) is proposed to update graph nodes. Finally, a multi-task learning approach is used to jointly perform answer span prediction, prediction of the evidence sentences that support the answer, and answer type prediction. Experimental results show that, on the Distractor setting of the open-source HotpotQA dataset, the model improves by 23.83%, 23.87% and 34.22% on answer F1, supporting sentence F1 and joint F1, respectively, compared with the baseline model, and also outperforms existing mainstream models. (3) A machine reading comprehension method for judicial texts based on the multi-granularity stepwise inference graph attention network is proposed. To address the insufficient knowledge inference and interpretability of existing machine reading comprehension models in the Chinese judicial domain, the research results of this paper are applied to machine reading comprehension in that domain. Experiments are conducted on the CAIL 2020 dataset from the Chinese judicial domain, which is highly specialized and involves complex relationships between persons and events, and the method is compared with existing models. The experimental results show that the proposed method effectively improves the accuracy and efficiency of machine reading comprehension of judicial texts, and provides more scientific and accurate support for courts in obtaining case information and making judgments. |
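
The paragraph ranking step in contribution (1) can be illustrated with a minimal Python sketch. The abstract does not give the scoring values or the aggregation rule, so both are assumptions here: paragraphs are scored 0 (not supporting), 1 (supporting) or 2 (supporting and containing the answer span), a pair label is 1 when the first paragraph outscores the second, and pairwise scores are aggregated by summing each paragraph's wins. The function names `paragraph_scores`, `pairwise_labels` and `rank_paragraphs` are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def paragraph_scores(is_supporting, has_answer):
    """Assign an ordinal score to each paragraph.

    Assumed scheme (not stated in the abstract):
    0 = not supporting, 1 = supporting, 2 = supporting and contains the answer span.
    """
    scores = []
    for sup, ans in zip(is_supporting, has_answer):
        if sup and ans:
            scores.append(2)
        elif sup:
            scores.append(1)
        else:
            scores.append(0)
    return scores


def pairwise_labels(scores):
    """Compare every paragraph with every other paragraph.

    labels[(i, j)] is 1 if paragraph i has a higher score than paragraph j, else 0.
    """
    n = len(scores)
    return {(i, j): int(scores[i] > scores[j]) for i, j in permutations(range(n), 2)}


def rank_paragraphs(pair_scores, n):
    """Turn pairwise scores into a ranking by summing each paragraph's wins.

    pair_scores maps (i, j) to a value in [0, 1]; at inference time these would be
    model predictions, here the gold labels are used as a stand-in.
    """
    relevance = [0.0] * n
    for (i, _j), p in pair_scores.items():
        relevance[i] += p
    # Python's sort is stable, so tied paragraphs keep their original order.
    return sorted(range(n), key=lambda i: relevance[i], reverse=True)


# Toy example with four candidate paragraphs (illustrative data only).
scores = paragraph_scores(
    is_supporting=[False, True, True, False],
    has_answer=[False, False, True, False],
)
labels = pairwise_labels(scores)
top_two = rank_paragraphs(labels, n=len(scores))[:2]
print(top_two)
```

In a trained model the pair labels would serve as supervision targets and the predicted pair scores would replace `labels` in the call to `rank_paragraphs`; the top-ranked paragraphs would then be passed to the downstream graph reasoning module.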