Research Of Semantic Retrieval And Classified Portrait Of Legal Judgments Documents

Posted on: 2020-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: T Zhu
Full Text: PDF
GTID: 2416330596481798
Subject: Computer technology
Abstract/Summary:
With the construction of China's Guidance Case System and the growing number of ways to obtain legal judicial documents, citing earlier cases for pending cases has become common in litigation. However, existing judicial document retrieval systems search only at the shallow literal level of the documents. They largely ignore the semantic relevance between cases and so waste the existing body of legal judicial documents. The existing classification of legal judicial documents likewise lacks a semantic basis. To overcome these shortcomings in semantic mining of legal judgment documents, the concept of NLP + law is proposed.

Firstly, with the distributed cluster computing framework MapReduce and the data warehouse Hive as support, word embeddings representing semantics are obtained by training a Skip-gram model with negative sampling on a large-scale corpus of legal judgment documents. Sentence embeddings representing context relations are then obtained with the Random Walk model and smooth inverse frequency (SIF). After these steps, legal judicial documents are expressed as distributed high-dimensional vectors, and the cosine similarity between vectors is used to measure the similarity between documents. The 10 cases with the highest similarity are selected as the citing cases for a pending case at the semantic level. Experiments show that similar-case queries based on sentence vectors achieve a high degree of semantic matching.

Secondly, a vector library of legal judicial documents is constructed, and the k-means and BIRCH algorithms are used to cluster the sentence vectors. Comparing the two by silhouette coefficient, experiments show that BIRCH gives the better clustering and divides the existing legal judicial documents into six clusters. A keyword list is then extracted for each cluster, based on the word embeddings, to serve as the cluster's label, so as to construct the semantic classification portrait of legal judicial documents.

The innovations of the paper are summarized as follows:
(1) Unlike traditional weighted averaging of word embeddings and complex neural network methods, the Random Walk model and the smooth inverse frequency (SIF) method are used to construct sentence embeddings. This improves on the traditional method, which ignores semantics, and avoids the construction complexity and long training time of neural network methods.
(2) Documents are first clustered on sentence embeddings, and keywords are then extracted from each cluster as labels on the basis of word embeddings, which improves the result of semantic mining.
(3) As an attempt to further the study of natural language processing (NLP) in the field of law, the application scenario of NLP + law is proposed. When text similarity is calculated, different weights are assigned according to the actual scenario in which the law is applied, and a semantic similarity query focusing on the individual elements of legal adjudicative documents is proposed, which improves the application scenario of NLP + law.

NLP makes citing cases practical in the legal field, and a new set of semantic classification portraits of legal judicial documents is constructed, which provides a new way to make thorough use of the abundant legal adjudication documents. Generally speaking, using neural networks to model the word embeddings of legal judicial documents, and using the Random Walk model and SIF to construct sentence embeddings, extracts the semantic information of the text better and improves the downstream text similarity query and clustering analysis. It also supports the wider application of natural language processing in the legal field.
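The retrieval step described in the abstract (SIF-weighted sentence embeddings ranked by cosine similarity) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the smoothing value `1e-3`, the power-iteration shortcut for the common component, and the tiny 3-dimensional word vectors are all assumptions made for illustration. In the thesis the word vectors come from a Skip-gram model trained on a large corpus.

```python
import math
from collections import Counter

def sif_weights(tokenized_docs, smoothing=1e-3):
    """Map each word w to smoothing / (smoothing + p(w)), p(w) = corpus frequency."""
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    total = sum(counts.values())
    return {w: smoothing / (smoothing + c / total) for w, c in counts.items()}

def weighted_average(doc, word_vecs, weights, dim):
    """SIF-weighted mean of the word vectors of one tokenized document."""
    known = [t for t in doc if t in word_vecs and t in weights]
    vec = [0.0] * dim
    for t in known:
        for i, x in enumerate(word_vecs[t]):
            vec[i] += weights[t] * x
    n = max(len(known), 1)
    return [x / n for x in vec]

def first_principal_direction(rows, iters=100):
    """Power iteration for the top right singular vector of the row matrix."""
    dim = len(rows[0])
    u = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        proj = [sum(r[i] * u[i] for i in range(dim)) for r in rows]
        v = [sum(p * r[i] for p, r in zip(proj, rows)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:  # all rows are zero vectors; nothing to remove
            return u
        u = [x / norm for x in v]
    return u

def sif_embeddings(tokenized_docs, word_vecs, smoothing=1e-3):
    """Weighted averages with the shared (common) component projected out."""
    dim = len(next(iter(word_vecs.values())))
    weights = sif_weights(tokenized_docs, smoothing)
    rows = [weighted_average(d, word_vecs, weights, dim) for d in tokenized_docs]
    u = first_principal_direction(rows)
    result = []
    for r in rows:
        p = sum(x * y for x, y in zip(r, u))
        result.append([x - p * y for x, y in zip(r, u)])
    return result

def cosine(a, b):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def top_k_similar(query_idx, embeddings, k=10):
    """Return up to k (index, similarity) pairs, most similar first."""
    q = embeddings[query_idx]
    scored = [(cosine(q, e), i) for i, e in enumerate(embeddings) if i != query_idx]
    scored.sort(reverse=True)
    return [(i, s) for s, i in scored[:k]]

# Hypothetical toy data: hand-made vectors standing in for trained embeddings.
word_vecs = {
    "contract": [1.0, 0.0, 0.2], "breach": [0.9, 0.1, 0.1],
    "damages": [0.8, 0.0, 0.3], "compensation": [0.8, 0.1, 0.3],
    "theft": [0.0, 1.0, 0.2], "prison": [0.1, 0.9, 0.1],
    "sentence": [0.0, 0.8, 0.3], "fine": [0.1, 0.8, 0.3],
}
docs = [
    ["contract", "breach", "damages"],
    ["contract", "breach", "compensation"],
    ["theft", "sentence", "prison"],
    ["theft", "prison", "fine"],
]

embeddings = sif_embeddings(docs, word_vecs)
print(top_k_similar(0, embeddings, k=2))
```

On this toy corpus the two contract documents rank each other first, as do the two theft documents. With trained embeddings and k=10, the same ranking step would yield the candidate citing cases for a pending case.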
Keywords/Search Tags: word embeddings, sentence embeddings, similarity, clustering, semantic, legal judgment documents, intelligent law