The Text Similarity Analysis Research On Chinese Judgement Document

Posted on: 2018-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
GTID: 2506305156475734
Subject: Software engineering
Abstract/Summary:
In recent years, the Supreme People’s Court has vigorously promoted the informatization of Chinese courts under the ideas of "big data, big pattern and big service". On this basis, research on and analysis of judicial big data have produced notable achievements but also face many problems and challenges that remain to be solved. Chinese judgement documents, which record the process and result of trials and have been aggregated in the course of informatization, are a valuable resource on the trial process. In this environment, some court tasks can be automated or improved through machine learning, such as recommending similar documents, evaluating workload based on the similarity of judgement documents, and predicting the laws likely to apply. To achieve these goals, and given the specific characteristics of Chinese judgement documents and the difficulties of existing text similarity approaches, an approach designed specifically for Chinese judgement documents is called for. This thesis proposes a topic-model-based approach to measuring the text similarity of Chinese judgement documents; it is based on semantic analysis and offers generality, a high degree of automation and high accuracy. In addition, when laws are taken as the basis for evaluating similarity, we introduce Labeled Latent Dirichlet Allocation (LLDA) into the general approach and put forward an assumption for a word-order-based approach. In the course of improving the approach, we also propose an extension based on LLDA, aimed at filtering out labeled topics that are obscure and hard for the LLDA model to represent. Finally, through a series of experiments, we verify that the decisions made in our approaches are reasonable, demonstrate the applicability of the approaches, and analyze the advantages, disadvantages and future work of our research.
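To make the idea of topic-model-based similarity concrete, the following is a minimal sketch, not the method used in the thesis. It assumes each judgement document has already been reduced to a vector of topic weights by some topic model (for example LDA or LLDA) and compares two such vectors with cosine similarity. The abstract does not state which similarity measure is used, so the choice of cosine similarity, the function name `cosine_similarity`, and the example vectors `doc_a`, `doc_b` and `doc_c` are all illustrative assumptions.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length topic-weight vectors.

    Assumption: p and q are document-topic distributions produced by a
    topic model; the measure actually used in the thesis is not stated.
    """
    if len(p) != len(q):
        raise ValueError("topic vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_p * norm_q)


# Hypothetical document-topic distributions over four topics.
doc_a = [0.70, 0.20, 0.05, 0.05]
doc_b = [0.65, 0.25, 0.05, 0.05]  # topic mix close to doc_a
doc_c = [0.05, 0.05, 0.20, 0.70]  # dominated by a different topic

sim_ab = cosine_similarity(doc_a, doc_b)
sim_ac = cosine_similarity(doc_a, doc_c)
```

Under these assumed vectors, `sim_ab` comes out higher than `sim_ac`, which is the behaviour one would want from a similar-document recommender: documents sharing a dominant topic rank above documents dominated by different topics.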
Keywords/Search Tags: Chinese Court, Judgement Document, Machine Learning, Natural Language Processing, Topic Model, Text Similarity, Latent Dirichlet Allocation, Improvement