| The deep integration of artificial intelligence with the judicial field has allowed smart justice to take root, and technological empowerment has advanced the modernization of the judicial system and judicial capabilities, which is of strategic significance for the construction of the country's legal system. Identifying the focus of disputes is an important step in judicial adjudication that drives case trials and dispute resolution. It aims to accurately extract the focus of dispute between the opposing parties and to promote the correct application of the law. However, the explosive growth of judgment documents, combined with a shortage of personnel relative to the caseload, places high demands on the efficiency of legal practitioners. To address this problem, this paper uses natural language processing to recast dispute focus identification as a multi-label text classification task, in which the judgment document is the text to be classified and the dispute focuses are the labels, and designs and implements a dispute focus identification system to improve the quality and efficiency of identification. Because judgment documents are long, it is difficult to fully extract their text features; in addition, the number of dispute focus labels in this work reaches several hundred, which creates a large potential output space and makes the model difficult to train. In response to these problems, and building on multi-label text classification techniques, this paper carries out the following work: (1) Statistical analysis and preprocessing are performed on a long-text dataset in the legal domain, mainly covering data acquisition, data distribution statistics, data splitting, data cleaning, and vectorized data representation. This statistical analysis and processing of the dataset effectively guides the design of the neural network structure and improves the training efficiency and
accuracy of the model. (2) A dispute focus identification algorithm that integrates a twin BERT network and a graph attention network (GAT) is proposed. First, because the length of judgment documents far exceeds the input length limit of most neural networks and truncation causes information loss, a twin BERT network is designed to take in almost the entire content of a judgment document and fully extract long-text features. Second, to exploit the potential associations among the hundreds of dispute focus labels, the graph attention network is used to learn the intrinsic correlations between dispute focuses. Finally, the interaction between the dispute focus representations and the semantic information of the judgment document enhances the document's dispute-focus-specific feature representation. Compared with advanced methods in this field on the legal dataset, Micro-F1 and Macro-F1 are improved by 1.53% and 0.94%, respectively. (3) A dispute focus identification system for judgment documents based on multi-label text classification is designed and implemented. The system requirements are analyzed, the functional modules are designed and implemented, and the system is built on a B/S (browser/server) architecture using the Flask framework, with MySQL as the data management tool to facilitate data retrieval. Finally, tests of the system show that it responds quickly to input documents and returns the identified dispute focuses. |
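
The abstract reports its gains in Micro-F1 and Macro-F1, which can diverge sharply when there are hundreds of labels with uneven frequencies. The following is a minimal sketch of how the two scores are computed for a multi-label task. The label names, the toy documents, and the helper names `f1` and `micro_macro_f1` are hypothetical and for illustration only; they are not taken from the paper's dataset or code.

```python
from typing import Dict, List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    gold: List[Set[str]], pred: List[Set[str]], labels: List[str]
) -> Tuple[float, float]:
    """Return (micro_f1, macro_f1) for multi-label predictions."""
    # Per-label [true positives, false positives, false negatives].
    counts: Dict[str, List[int]] = {label: [0, 0, 0] for label in labels}
    for g, p in zip(gold, pred):
        for label in labels:
            if label in p and label in g:
                counts[label][0] += 1
            elif label in p:
                counts[label][1] += 1
            elif label in g:
                counts[label][2] += 1

    # Micro-F1 pools the counts over all labels, so frequent labels dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)

    # Macro-F1 averages per-label F1, so rare labels weigh as much as common ones.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    return micro, macro


# Hypothetical dispute focus labels and three toy judgment documents.
labels = ["liability", "compensation_amount", "contract_validity"]
gold = [{"liability", "compensation_amount"}, {"contract_validity"}, {"liability"}]
pred = [{"liability"}, {"contract_validity", "liability"}, {"liability"}]

micro, macro = micro_macro_f1(gold, pred, labels)
print(f"Micro-F1 = {micro:.2f}, Macro-F1 = {macro:.2f}")  # Micro-F1 = 0.75, Macro-F1 = 0.60
```

In this toy case the missed `compensation_amount` label scores an F1 of 0, which pulls Macro-F1 down to 0.60 while Micro-F1 stays at 0.75. This is why reporting both scores is informative when the label space is large and unevenly distributed.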