The Design And Implementation Of Big Data Quality Inspection Platform For Judgment Documents

Posted on: 2020-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: H Lian
GTID: 2416330575452525
Subject: Engineering
Abstract/Summary:
Against the background of China's Wisdom Court, the amount of judicial data that computers can store and process has grown rapidly, and it is widely recognized that judicial data carries enormous social and business value. As the key data in the trial and execution process, the judgment document not only integrates the case information from the judicial business process but also provides a data foundation for new judicial services such as case retrieval, case recommendation, and penalty prediction. Data quality determines the upper limit of what these services can achieve: only when the data quality is up to standard can the value of the data be fully realized. Court judgment document data is stored in XML format, with the case and trial information described in Chinese. The courts' current quality detection methods for document data only verify content compliance; they lack semantic analysis of context and do not assess data quality at the information level. In view of this, this thesis proposes a quality inspection system for judgment documents, divided into structured content quality and unstructured semantic quality. The content quality metrics combine objective information theory and rough set theory and consist of nine dimensions: suitability, broadness, granularity, coverage, delay, persistence, inclusiveness, richness, and authenticity. The semantic quality metrics adopt natural language processing methods: the case description is analyzed with dependency syntax analysis and semantic role labeling, eight semantic features are constructed, and a semantic contribution model is proposed to measure semantic quality. Given the huge volume of judgment documents, this thesis uses Hadoop components to design and implement a platform for quality inspection of judgment documents. The platform has four modules: data interaction, document analysis, quality inspection, and privilege management, which together provide distributed storage and data quality inspection services in a big data environment. The proposed quality inspection system comprehensively measures the quality of judgment documents, and the big data platform delivers the quality inspection service as the data continues to grow. Results from the thesis have been submitted to the Supreme People's Information Service Center as a proposal.
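
The abstract does not define how any of the nine content quality dimensions is computed, so the following is only a minimal sketch of what a structured content check on an XML judgment document might look like. It measures the share of expected fields that are present and non-empty, which is a simple stand-in and not the thesis's metric. The element names (`court`, `case_number`, `judgment_date`, `case_description`) and the function `field_completeness` are hypothetical, since the actual schema of the court documents is not given here.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names; the real schema of the court's XML is not given in the abstract.
EXPECTED_FIELDS = ["court", "case_number", "judgment_date", "case_description"]


def field_completeness(xml_text, expected=EXPECTED_FIELDS):
    """Return the share of expected fields that are present with non-empty text.

    This is an illustrative stand-in, not one of the thesis's nine dimensions.
    """
    root = ET.fromstring(xml_text)
    filled = 0
    for name in expected:
        node = root.find(f".//{name}")  # first descendant element with this tag
        if node is not None and (node.text or "").strip():
            filled += 1
    return filled / len(expected)


# Made-up sample document: two fields filled, one empty, one missing.
sample = """<document>
  <court>Example Court</court>
  <case_number>(2019) No. 1</case_number>
  <judgment_date></judgment_date>
</document>"""

print(field_completeness(sample))  # 0.5
```

A per-document score of this kind could in principle be computed independently for each file, which is what would make such checks suitable for distribution across a Hadoop cluster; how the platform actually schedules them is not described in the abstract.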
Keywords/Search Tags: Judgment Documents, Data Quality, Natural Language Processing, Big Data, Hadoop