Optimization Of Automatic Summarization Algorithm For Long Chinese Text And Its Application In Cloud Classroom

Posted on:2023-11-16

Degree:Master

Type:Thesis

Country:China

Candidate:M Lei

GTID:2558306914971949

Subject:Computer technology

Abstract/Summary:

With the rapid development of the information age, people face an ever-growing volume of information. This brings convenience, but it also means spending considerable effort filtering out the content they actually need. Automatic text summarization generates summaries by compressing the original text and extracting its key information, which greatly speeds up information retrieval and reduces reading cost, so it is of considerable research interest. Although the technique has made good progress, it is limited by algorithmic models and hardware conditions: it works mainly in short-text scenarios, and many problems remain in long-text applications.

The development of Chinese long-text summarization is currently limited in three respects. First, generative summarization models are difficult to train to convergence in long-text scenarios and are prone to semantic errors and repetition. Second, traditional word vector models cannot deeply understand the semantics of long Chinese texts and capture only the literal meaning of sentences. Third, and most importantly, corpora for Chinese long-text summarization are insufficient: most current Chinese summarization corpora target short texts, which makes it difficult to train Chinese long-text summarization models with supervision. This thesis therefore focuses on the graph-based algorithm LexRank and proposes BLSummary, an extractive summarization algorithm for long Chinese texts that addresses two weaknesses of LexRank: insufficient mining of sentence semantics and the inability to handle words with multiple meanings. BLSummary improves on LexRank by using the BERT model to mine the deep semantics of sentences and combining it with the LDA topic model to extract keywords. Finally, LexRank and its related algorithms were compared experimentally using the ROUGE evaluation metrics. The results show that, on a publicly available education news dataset, BLSummary improves on the traditional LexRank algorithm by 8%, 5%, and 8% in ROUGE-1, ROUGE-2, and ROUGE-L, respectively.

Cloud classroom systems currently contain many long articles, notes, and other content, which makes it difficult for users to obtain valuable information quickly. In addition, existing automatic summarization algorithms are oriented mainly toward short-text scenarios and are not effective for long texts. This thesis therefore designs and implements a cloud classroom system that combines video content and text content. The system uses the BLSummary algorithm to automatically summarize articles, notes, and similar content in the system, speeding up access to information and improving the user experience.
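
The graph-based ranking that LexRank performs, and that BLSummary is described as improving, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not give BLSummary's details, so the sketch substitutes plain term-frequency vectors for BERT sentence embeddings and omits the LDA keyword step entirely. The function names (`sentence_vector`, `lexrank`, `summarize`), the damping factor of 0.85, and the fixed iteration count are all illustrative assumptions. It also tokenizes on word boundaries, which suits English input; Chinese text would need a word segmenter first.

```python
import math
import re
from collections import Counter


def sentence_vector(sentence):
    # Stand-in representation (assumption): term-frequency counts.
    # BLSummary is described as using BERT sentence semantics instead.
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank(sentences, damping=0.85, iterations=50):
    # Score sentences by their centrality in a similarity graph
    # (the continuous, similarity-weighted variant of LexRank).
    n = len(sentences)
    if n == 0:
        return []
    vectors = [sentence_vector(s) for s in sentences]

    # Pairwise similarities, with no self-loops.
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]

    # Row-normalize into transition probabilities; a sentence with no
    # similar neighbours spreads its weight uniformly.
    trans = []
    for row in sim:
        total = sum(row)
        trans.append([v / total for v in row] if total > 0 else [1.0 / n] * n)

    # Power iteration, as in PageRank.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [(1 - damping) / n
                  + damping * sum(scores[i] * trans[i][j] for i in range(n))
                  for j in range(n)]
    return scores


def summarize(sentences, k):
    # Keep the k highest-scoring sentences, in their original order.
    scores = lexrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k)` on a list of sentences returns the `k` most central ones in document order. Replacing `sentence_vector` and `cosine` with similarities computed from contextual embeddings is the kind of change the abstract attributes to BLSummary, though the exact way it combines BERT and LDA scores is not stated here.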

Keywords/Search Tags:

automatic text summarization, long Chinese text, BERT, LexRank, cloud classroom

Related items

1	Research On Automatic Text Summarization Algorithm For Chinese Long Text
2	Research On Automatic Generation Method Of Chinese Text Summarization
3	Research On Automatic Text Summarization In Chinese
4	Chinese Text Summarization Technology Based On Improved BERT Pre-training Model And Graph Neural Network
5	Research On Automatic Text Summarization Algorithm For Chinese And English Long Text
6	Research On Automatic Text Abstract System Based On Chinese Long Text
7	Research On Content Semantic Analysis Based Text Summarization Methods
8	Research On Chinese Text Summarization Technology Based On BERT-KA-PGN Model
9	Research And Implementation Of Text Summarization Technology Based On Semantic Understanding
10	Research On Chinese-Vietnamese Cross-language Text Summarization Method Based On Transformer Structur