| As society develops and the standards for legal practitioners become increasingly strict, the number of legal practitioners grows far more slowly than the number of criminal cases, leaving judicial organs with "more cases and fewer people". In addition, a judge's discretion is affected by subjective factors such as level of knowledge and work experience, which can lead to "different judgments in the same case" and undermine judicial fairness. To address these problems, this thesis takes criminal judgment documents as input and uses natural language processing to decompose automatic case analysis into four subtasks: recommendation of relevant law articles, prediction of the defendant's charges, prediction of the defendant's sentence, and recommendation of similar cases. The aim is to help judges work more efficiently as part of the construction of judicial intelligence, and to ease the imbalance between the number of criminal cases, the manpower of judicial organs, and the workload of adjudication. Because the four subtasks are not fully independent, this thesis models them jointly with multi-task learning, exploiting their shared factors and correlations by sharing judgment document preprocessing, semantic encoding, and other components, so as to improve the generalization ability of the case analysis method. Specifically, because a defendant may violate more than one law article or face more than one charge, law article recommendation and charge prediction are treated as two multi-label classification tasks that share the input layer, embedding layer, and semantic encoding layer. For sentence prediction, the sentences in the dataset are divided into intervals according to the data distribution, and text classification is used to predict the sentence interval. While sharing the input, embedding, and semantic encoding layers with the tasks above, this task also extracts and integrates rule information from the sentencing guidance to make sentence prediction more interpretable. For similar case recommendation, labeled data is lacking, so this thesis builds a similar case database organized by law articles and charges. Given the criminal facts of a new case, the relevant part of the database is located using the results of law article recommendation and charge prediction. The semantic encoding model trained in the above tasks then encodes the case texts, the similarity between them is computed, and similar cases are ranked and recommended accordingly. The experiments use the CAIL2018 dataset, supplemented with criminal judgment documents downloaded from China Judgments Online to reduce, to some extent, the impact of imbalanced label distribution. The results show that, compared with training each subtask independently, multi-task learning with a partially shared structure effectively improves the performance of the case analysis method. |
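
The similar case recommendation step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the `Case` record, the `recommend_similar` function, the overlap-based filtering rule, the use of cosine similarity, and the toy three-dimensional vectors are all assumptions made for illustration. In the thesis, the vectors would come from the shared semantic encoding model, and the similarity measure is not specified in the abstract.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """Hypothetical record for one entry in the similar case database."""
    case_id: str
    articles: frozenset  # law articles cited in the judgment
    charges: frozenset   # charges in the judgment
    vector: tuple        # text encoding (toy values here, not real encoder output)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_similar(query_vector, predicted_articles, predicted_charges,
                      database, top_k=3):
    """Filter by predicted articles and charges, then rank by similarity.

    Assumption: a stored case is a candidate if it shares at least one
    article AND at least one charge with the predictions for the query.
    """
    articles = set(predicted_articles)
    charges = set(predicted_charges)
    candidates = [c for c in database
                  if c.articles & articles and c.charges & charges]
    scored = [(c.case_id, cosine(query_vector, c.vector)) for c in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy database; the article numbers and vectors are illustrative only.
DATABASE = [
    Case("A", frozenset({"264"}), frozenset({"theft"}), (1.0, 0.0, 0.0)),
    Case("B", frozenset({"264"}), frozenset({"theft"}), (0.6, 0.8, 0.0)),
    Case("C", frozenset({"234"}), frozenset({"intentional injury"}), (1.0, 0.0, 0.0)),
]

if __name__ == "__main__":
    for case_id, score in recommend_similar((0.9, 0.1, 0.0), {"264"}, {"theft"}, DATABASE):
        print(case_id, round(score, 3))
```

Filtering before ranking keeps the similarity computation restricted to cases that share the predicted articles and charges, which mirrors the idea of locating the relevant part of the database from the outputs of the other subtasks. In this toy example, case C is excluded by the filter even though its vector is close to the query.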