
Research On The Method Of Generating Tibetan News Text By Key Information

Posted on: 2024-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Lu
Full Text: PDF
GTID: 2558307079491424
Subject: Applied statistics

Abstract/Summary:
Having computers write high-quality text in place of humans has long been a goal of artificial intelligence, and text generation is the task that pursues this goal. At present, deep neural network models are the dominant approach to text generation because of their powerful ability to capture features and learn knowledge, but further improving the quality of generated text still requires deeper exploration and research. Intelligent processing of Tibetan is of great significance to the cultural and economic development of the Tibetan people. Tibetan natural language processing has achieved good results in word segmentation, semantic analysis, and machine translation, but there are still few research results on text generation. This thesis therefore uses deep neural network models to generate long, high-quality Tibetan texts, and takes this task as the starting point for exploring a text generation method suited to Tibetan. The text generation task is first studied on the basis of pre-training models; then, considering the requirements and purpose of news text generation, a new method of generating text from key information is proposed; finally, data augmentation is incorporated into the model to complete the task. The main content and contributions of this thesis are summarized in the following four aspects. (1) A corpus of 133,012 Tibetan news articles was collected and the Tibetan dataset T-News was constructed, providing the data basis for model training. (2) Suitable deep neural network models were selected according to the special structure of Tibetan text and the text generation objective, and two new pre-training models, T-Transformer-XL and T-XLNet, were trained on Tibetan. These two models are used to complete the text generation task and can also serve other downstream tasks, enriching the set of pre-training models available for Tibetan. Experiments show that both models perform well on text generation. (3) Since news text is generated around a specific topic, this thesis proposes a method for generating Tibetan text from key information. Experiments show that this method performs well and accomplishes the task effectively. (4) A method that combines data augmentation with Tibetan text generation is proposed. It can not only efficiently generate news articles of good quality and similar semantics, but also multiply the size of the Tibetan corpus. Experiments show that this method achieves good text generation results. All the methods proposed in this thesis are shown by the experimental results to be feasible and effective, and they further promote the development of Tibetan natural language processing.
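To make the key-information-conditioned generation step concrete, the following is a minimal sketch (not the thesis code) of how an XLNet-style pre-trained language model could be prompted with key information and asked to continue it into a news passage, using the Hugging Face Transformers library. The checkpoint path "t-xlnet-tibetan" is a hypothetical stand-in for the thesis's T-XLNet model, and the decoding settings are illustrative assumptions.

```python
# Sketch: generate a news passage conditioned on key information,
# assuming a locally available XLNet-style checkpoint for Tibetan.
from transformers import XLNetLMHeadModel, XLNetTokenizer

model_dir = "t-xlnet-tibetan"  # hypothetical local checkpoint path
tokenizer = XLNetTokenizer.from_pretrained(model_dir)
model = XLNetLMHeadModel.from_pretrained(model_dir)

# Key information (e.g. topic words or a headline) is used as the prompt;
# the model continues it into the body of the article.
key_info = "..."  # Tibetan key words / headline for the target article
inputs = tokenizer(key_info, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=256,   # length budget for the generated body text
    do_sample=True,       # sample rather than decode greedily
    top_p=0.9,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern would apply to a Transformer-XL-style checkpoint; only the model and tokenizer classes change.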
Keywords/Search Tags:natural language processing, neural network, text generation, pre-training model