Legal Documents Recommendation System Based On Document Similarity

Posted on: 2019-05-05  Degree: Master  Type: Thesis
Country: China  Candidate: P Y Wu
GTID: 2416330548970725  Subject: Engineering
Abstract/Summary:
Smart procuratorial work builds on the networking, digitalization, and informatization of procuratorial organs to make procuratorial work comprehensively intelligent, with smart services, smart management, and smart support across the board. Society is moving from the information era into the AI era. Technologies such as cloud computing, big data, and artificial intelligence should be applied in depth in prosecution systems so that prosecution work and information technology become highly integrated. This serves both the public and internal judicial personnel, and it makes the safeguarding of fairness and justice more effective. Implementing smart procuratorial work puts judicial work first. After summarizing and analyzing the overall development trend of today's smart procuratorial systems and emerging Internet applications, this thesis starts from the characteristics of a legal document recommendation system and replaces the prosecutor's traditional manual screening of legal documents with intelligent recommendation of related legal documents. The recommended documents can support prosecutors in handling cases, for example through similar-case push and sentencing assistance.

The legal document recommendation in this thesis is implemented in four steps. First, legal documents are acquired. The data source is the legal documents published on the People's Procuratorate Case Information Disclosure Network; the site's web page structure is analyzed and the legal documents are crawled to obtain the data set. Second, the legal documents are segmented into words. The crawled data set is pre-processed (for example, desensitization and noise removal), an extended dictionary and a stop-word dictionary for the legal document domain are added, and a Hidden Markov Model combined with the Viterbi algorithm performs the word segmentation, yielding the segmented data set. Third, a data model based on document similarity is constructed. All segmented legal documents are written into a single file together with Chinese Wikipedia data to form the training corpus, and word2vec is trained on this corpus to produce a word vector model. Then, in each experiment, 100 legal documents are selected as the experimental set, and the text matrix for this set is computed with the Word Mover's Distance (WMD) algorithm on top of the word vector model. Fourth, legal documents are recommended to the prosecutor. Top-k and key-value methods are used to find, from the text matrix, the top n legal documents most similar to the legal document the prosecutor inputs. Finally, experimental verification shows that this method helps with similar-case push, sentencing assistance, and prosecutors' retrieval of legal documents of interest.
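The third and fourth steps can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and everything in it is an assumption made for illustration: the two-dimensional vectors in `TOY_VECTORS` stand in for a trained word2vec model, the document ids and tokens in `DOCS` are invented, the "text matrix" is taken to be a pairwise distance matrix stored as nested dictionaries (one possible reading of "key-value"), and the distance is the relaxed word mover's distance, a cheap lower bound that is swapped in here for the exact WMD.

```python
import heapq
import math

# Toy 2-D word vectors; hypothetical stand-ins for a trained word2vec model.
TOY_VECTORS = {
    "theft": (1.0, 0.0),
    "robbery": (0.9, 0.1),
    "fraud": (0.0, 1.0),
    "contract": (0.1, 0.9),
    "sentence": (0.5, 0.5),
}

# Hypothetical segmented documents: document id -> list of tokens.
DOCS = {
    "doc_theft": ["theft", "sentence"],
    "doc_robbery": ["robbery", "sentence"],
    "doc_fraud": ["fraud", "contract"],
    "doc_contract": ["contract", "sentence"],
}


def relaxed_wmd(doc_a, doc_b, vectors):
    """Relaxed word mover's distance, a lower bound on the exact WMD.

    Each token sends all of its weight to the nearest token of the other
    document; the larger of the two directions is returned.
    """
    def one_way(src, dst):
        weight = 1.0 / len(src)  # uniform weight per token occurrence
        return sum(
            weight * min(math.dist(vectors[w], vectors[v]) for v in dst)
            for w in src
        )

    return max(one_way(doc_a, doc_b), one_way(doc_b, doc_a))


def build_distance_matrix(docs, vectors):
    """Return {doc_id: {other_id: distance}} for every pair of documents."""
    return {
        i: {j: relaxed_wmd(docs[i], docs[j], vectors) for j in docs if j != i}
        for i in docs
    }


def recommend(matrix, query_id, k):
    """Return the ids of the k documents closest to query_id."""
    row = matrix[query_id]
    return heapq.nsmallest(k, row, key=row.get)


MATRIX = build_distance_matrix(DOCS, TOY_VECTORS)
print(recommend(MATRIX, "doc_theft", 2))
```

Storing each row as a dictionary keeps the lookup for a query document to a single key access, and `heapq.nsmallest` avoids sorting the whole row when only a few recommendations are needed. With a real word vector model, an exact WMD routine would replace `relaxed_wmd`.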
Keywords/Search Tags: web crawler, word segmentation, word2vec, word vector, legal document recommendation