Research And Implementation Of Fact-based Case Big Data Query Technology

Posted on: 2022-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: W K Cao
Full Text: PDF
GTID: 2516306722488834
Subject: Computer technology
Abstract/Summary:
In recent years, as legal awareness has continued to improve, public demand for judicial fairness has become increasingly urgent. However, owing to objective factors such as the growing workload of China's judicial system, the uneven professional quality of judges, and differences between regions, "different judgments for similar cases" occur from time to time, which damages judicial credibility. Building a complete retrieval system for similar cases is therefore of great significance for eliminating this phenomenon and maintaining judicial fairness and justice. Intuitively, the key to judging whether two cases are similar is whether their case facts are similar. However, the massive scale of the data and the inherent irregularity of case facts described in natural language make it extremely difficult to design an efficient and accurate case retrieval system. To address this, this thesis proposes a fact-based solution for querying big data of cases, which comprises the following research contents:

(1) A key information extraction method for case facts. Descriptions of case facts contain a large amount of redundant content, so the key information is ambiguous and difficult to extract, which greatly limits the ability of existing deep models. Because relevant research datasets are lacking, this thesis first collects cases of different types to build a research corpus, and then proposes a method for extracting key information from case facts based on an improved TF-IDF algorithm. Experiments show that this method extracts key fact information with an accuracy of 75.1%. This part of the research lays the foundation for the subsequent exploration of similar case matching models.

(2) A similar case matching model based on multi-fusion siamese RoBERTa. The facts of a case are usually an unstructured natural language description of the key events in the development of the case, containing many common words, legal terms, and a large amount of redundant information, which makes it very challenging to measure the similarity of two sets of case facts accurately. To this end, this thesis proposes a similar case matching model based on multi-fusion siamese RoBERTa. The method fuses three siamese RoBERTa models whose configuration parameters differ significantly, so that the models complement one another and the accuracy of similar case matching improves. Experiments show that the model matches similar cases with an accuracy of 85.1%, laying a technical foundation for the construction of a complete similar-case retrieval system.

(3) Design and implementation of a case big data query system. Existing legal case retrieval systems mostly retrieve by key fields (such as plaintiff, defendant, and cause of action), which cannot meet the requirements of similar-case retrieval. Starting from the urgent practical need for similar-case retrieval, this thesis designs and implements a case big data query system based on the similar case matching model proposed above. The system uses a B/S (browser/server) architecture; the underlying database uses HBase to store a large number of legal case documents and is optimized in conjunction with Elasticsearch. The system can not only store and manage large-scale case data effectively, but also help users find similar cases quickly and accurately among a large number of existing cases, meeting the need for similar-case retrieval.
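As an illustration of the key information extraction idea in (1), the following is a minimal sketch of keyword ranking with plain TF-IDF. It is not the improved TF-IDF variant proposed in the thesis, whose details the abstract does not give. The function names, the toy tokenizer, and the sample case facts are all assumptions made for this example; real Chinese case texts would need a proper word segmenter.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Toy tokenizer (assumption): lowercase alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())


def top_keywords(docs, doc_index, k=3):
    """Return the k highest-scoring plain TF-IDF terms of docs[doc_index]."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    tokens = tokenized[doc_index]
    if not tokens:
        return []

    tf = Counter(tokens)
    scores = {}
    for term, count in tf.items():
        idf = math.log(n_docs / df[term])  # 0 for terms present in every document
        scores[term] = (count / len(tokens)) * idf

    # Highest score first; ties broken alphabetically so the result is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical case-fact snippets, invented for illustration only.
facts = [
    "the defendant borrowed money from the plaintiff and refused to repay the loan",
    "the defendant stole a phone from the plaintiff in the market",
    "the plaintiff and the defendant signed a lease for the apartment",
]
print(top_keywords(facts, 0, k=3))
```

Words that appear in every case, such as "plaintiff" and "defendant", receive an IDF of zero and drop out of the ranking, while terms specific to one case rise to the top. This is the property that makes TF-IDF a reasonable baseline for filtering redundant content from case facts.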
Keywords/Search Tags: category case retrieval, similar case matching, big data retrieval of cases, key information extraction