
Research on Methods for Summarizing Public Opinion on Microblogs Involved in Criminal Cases

Posted on: 2023-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: M X Ma
GTID: 2567306797473344
Subject: Software engineering
Abstract/Summary:
With the development of the Internet, microblogs have become the main platform for exposing trending social events and for official notifications about legal cases. These posts provoke heated public discussion and generate a large number of potentially valuable comments. Classifying the comments on case-related microblogs by opinion object, and then generating a comprehensive and accurate summary from the comment cluster of each opinion object, is therefore of great significance for public opinion analysis and for emergency handling of a case. However, summarizing comments and opinions on case-related microblogs is a task specific to the judicial domain. Comments are phrased arbitrarily and non-standardly, so effective features are hard to extract, and comment-summary corpus pairs are lacking. As a result, generated summaries suffer from ambiguous references, noise, and inconsistent topics. Based on the characteristics of comments on case-related microblogs, this thesis integrates the guidance information in the microblog post content and studies opinion object classification and comment summary generation using the attention mechanism, contrastive learning, the auto-encoder framework, and other unsupervised methods. The thesis completes the following research work:

(1) Construction of a corpus for opinion object classification and summarization of comments on case-related microblogs. A large corpus of such comments is the foundation of this research. Crawler technology was therefore used to collect a total of 120,000 case-related microblog comments for opinion object classification, and 100 comment-summary pairs were constructed, providing the data for the subsequent experimental models.

(2) A self-supervised method, based on contrastive learning, for classifying the opinion objects of comments on case-related microblogs. Microblog comments often lack explicit opinion object nouns, and traditional methods struggle to extract effective sentiment features from them. To address this, a multi-head attention global information enhancement module captures the key sentiment fragments in each comment, and self-supervised contrastive learning then strengthens the keyword features related to the opinion object, which effectively mitigates the scarcity of labeled classification samples. The comment texts are finally divided into four categories: judicial organs, parties, charges, and others. On the constructed dataset of 120,000 case-related microblog comments, the method improves macro-averaged F1 by 2.2% over the existing baseline model. This indicates that the global information enhancement module effectively captures comment context and that the keyword co-occurrence relationships of opinion objects improve classification.

(3) An unsupervised method, integrating the microblog post content, for summarizing the opinion object comments on case-related microblogs. Comment-summary corpora for case-related microblogs are lacking, and opinion object comments are non-standard, inaccurate, fragmented, redundant, and noisy. The post content, however, contains the key information of the case and can help filter out irrelevant comments and generate summaries that stay on topic. To better represent the comments and the post content, an auto-encoder is trained by back-propagation over the content and comment features so that the encoded comments approximate the input text vector representation as closely as possible. In the summarization module, the content and comment representations are fused by an attention mechanism, and a similarity loss between the summary and the fused content-comment representation is computed so that the summary stays as close as possible to the microblog comments. On the encoder side, the post content and comments are encoded with a Bi-LSTM. An information filtering module then selects the comments relevant to the post content, which are concatenated directly and decoded into a summary. On the summary dataset of opinion object comments on case-related microblogs, the method improves ROUGE-1 by 0.47, ROUGE-2 by 0.51, and ROUGE-L by 0.36.

(4) A prototype system for summarizing comments on case-related microblogs. The prototype consists of four parts: a data acquisition module, a functional model layer, a web layer, and a model interface layer. By importing the opinion object classification and summarization models, the system automatically summarizes the comment clusters of the different opinion objects and displays the comment text and the summary in its interface.
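
The abstract does not state which contrastive objective is used in contribution (2). As an illustration only, the sketch below shows an InfoNCE-style loss, a common choice in self-supervised contrastive learning: the loss is small when an anchor comment embedding is closer to a positive (for example, another view of the same comment) than to negatives. The function names, the temperature value, and the toy two-dimensional vectors are hypothetical and are not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for a single anchor (illustrative).

    loss = -log( exp(s_pos / t) / (exp(s_pos / t) + sum_k exp(s_neg_k / t)) )
    where s_* are cosine similarities and t is the temperature.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy embeddings (assumed): the anchor and its positive point the same way,
# while the negative is orthogonal to the anchor.
anchor = [1.0, 0.0]
aligned_loss = info_nce_loss(anchor, [1.0, 0.0], [[0.0, 1.0]])
# Swapped roles: the "positive" is orthogonal and the negative is aligned.
misaligned_loss = info_nce_loss(anchor, [0.0, 1.0], [[1.0, 0.0]])
print(aligned_loss, misaligned_loss)
```

Minimizing such a loss pulls representations of related comments together and pushes unrelated ones apart, which is one way label scarcity can be worked around without manual annotation.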
Keywords/Search Tags: summary of comments on microblogs involved in the case, opinion object classification, multi-head attention mechanism, contrastive learning, auto-encoder