
Application of an Improved TF-IDF Algorithm to the Search of Judicial Judgment Documents

Posted on: 2018-11-30  Degree: Master  Type: Thesis
Country: China  Candidate: Y Ding  Full Text: PDF
GTID: 2436330602459377  Subject: Computer technology
Abstract/Summary:
Driven by the rapid development of Internet technology and the continued promotion of judicial openness, online judgment document databases now provide China's judicial system with a large body of documents on judicial decisions. As the number of judicial judgment documents grows explosively, it becomes increasingly difficult to find the documents one needs quickly, accurately, and effectively. Existing judgment document databases, whether the court system's own China Judgments Online or databases built by Internet companies such as Beida Fabao (PKULaw) and Wusong Cases, meet the need for access to judgment documents to some degree, but they do not do well at improving search accuracy, identifying users' latent search intentions, or mining the information users actually need. Therefore, building on existing studies of search in judgment document databases, this thesis analyzes the features of judicial judgment documents in greater depth and focuses on data collection and keyword extraction for such databases. Drawing on techniques for crawling documents, parsing their content, mining judgment documents, and extracting keywords, it presents a requirements analysis of the functions of a judgment document search system and designs the system's overall framework and component modules. In addition, it uses an improved TF-IDF technique to implement keyword extraction for judgment documents, describes the system's implementation, and reports test results. The main contributions of the thesis are twofold: (1) by integrating Web crawling, text classification and clustering, and indexing technology, it designs a mining framework for judicial documents; (2) on the basis of the traditional TF-IDF algorithm, combined with the characteristics of judicial documents, it proposes an improved TF-IDF algorithm that can improve the accuracy of feature extraction and keyword extraction. With the judicial document retrieval system designed here, users can search faster and more accurately for what they need, and get a better experience.
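
The abstract does not say how the improved algorithm changes TF-IDF, so the following is only a minimal sketch of the traditional TF-IDF baseline it starts from, applied to keyword extraction. The function name `tfidf_keywords` and the toy token lists are assumptions made for illustration; they are not taken from the thesis, and real judgment documents would first need word segmentation.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=3):
    """Rank the terms of docs[doc_index] by classic TF-IDF (hypothetical helper).

    docs is a list of non-empty token lists; returns (term, score) pairs.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    tokens = docs[doc_index]
    counts = Counter(tokens)
    scores = {}
    for term, count in counts.items():
        tf = count / len(tokens)            # relative term frequency
        idf = math.log(n_docs / df[term])   # plain, unsmoothed IDF
        scores[term] = tf * idf

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy, invented corpus standing in for tokenized judgment documents.
docs = [
    ["court", "contract", "breach", "contract", "damages"],
    ["court", "theft", "sentence", "defendant"],
    ["court", "contract", "lease", "tenant"],
]

for term, score in tfidf_keywords(docs, 0):
    print(f"{term}: {score:.4f}")
```

In this toy corpus, "court" appears in every document, so its IDF is log(1) = 0 and it drops to the bottom of the ranking. This is the behavior that makes TF-IDF useful for picking out terms specific to one document.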
Keywords/Search Tags: Web crawler, keyword extraction, TF-IDF algorithm, judicial document retrieval system