
Research On Sentiment Classification Technology Of Tibetan Text

Posted on: 2021-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: C Z M Que
Full Text: PDF
GTID: 2435330620975843
Subject: Computer application technology
Abstract/Summary:
With the constant growth of Tibetan web pages and digital libraries, an increasing number of Tibetan intellectuals express their opinions and thoughts about things and events on the Internet. These opinions and thoughts often carry emotional expressions. Analyzing them benefits data analysis in Tibetan language processing and also applies to public opinion monitoring, marketing strategy, and the personalization of Tibetan question answering systems. Text emotion classification for Chinese and English is relatively mature, but Tibetan natural language processing started late and research on emotion classification remains scant. For the emotion analysis of Tibetan texts, this thesis studies the preprocessing of Tibetan text, namely automatic sentence recognition, word segmentation, and syllable segmentation. It then uses deep learning to classify the emotion of Tibetan sentences, and combines sentence-level emotion classification with a dictionary (including emotion words and degree adverbs) to classify Tibetan short texts. The main contents and contributions of this thesis are as follows:

(1) In response to the current needs of Tibetan emotion classification, a corpus of 15,000 Tibetan emotion sentences was constructed, covering positive, negative, and neutral sentences drawn from conversations, opinions, and conclusions in different Tibetan text styles. Based on Tibetan word segmentation and syllable segmentation followed by manual proofreading, two standard sentence-level emotion-tagged corpora were constructed, one segmented into words and one segmented into syllables.

(2) Preprocessing techniques for the Tibetan emotion corpus were studied. To handle the training and testing of Tibetan emotion sentences effectively, this thesis first proposes, in addition to the existing Tibetan word segmentation system, an automatic Tibetan sentence boundary recognition method based on a hybrid strategy, which addresses the automatic segmentation of Tibetan sentences. Second, it proposes a mixed-mode syllable segmentation method based on the continuation rules of case particles and on context. In experiments, the accuracy of automatic sentence segmentation and of syllable segmentation both reach 99%.

(3) After constructing the standard sample corpus and solving the preprocessing problems, this thesis proposes a Tibetan sentence emotion classification method based on word vectors and a bidirectional LSTM. By analyzing and filtering the stop-word list for Tibetan sentences, studying the emotion features and their distributions in different types of Tibetan sentences, and using word vectors with a bidirectional LSTM model, an emotion classification model suited to Tibetan sentences was trained and a sentence-level automatic Tibetan emotion classification system was realized. In experiments, the accuracy on the different types of emotion sentences reaches 89%, 90%, and 89%.

(4) A Tibetan text emotion classification system was implemented. Building on the emotion classification of Tibetan sentences, this thesis studies the classification of Tibetan emotion text. Using sentence-level emotion classification together with dictionary weighting (emotion words and degree adverbs), emotion classification was performed on paragraph-level Tibetan text, and a Tibetan emotion classification system was implemented.
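The abstract does not give the formula used to combine sentence-level labels with dictionary weighting, so the following is only a minimal sketch of one plausible scheme, not the thesis's method. It assumes that each sentence already has a label from some sentence-level classifier, that emotion words carry a polarity of +1 or -1, and that a degree adverb scales the emotion word immediately after it. The lexicon entries (English placeholders standing in for Tibetan words), the function names, and the threshold are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicons; English placeholders stand in for Tibetan entries.
EMOTION_WORDS: Dict[str, float] = {"good": 1.0, "happy": 1.0, "bad": -1.0, "sad": -1.0}
DEGREE_ADVERBS: Dict[str, float] = {"very": 2.0, "slightly": 0.5}

# Assumed mapping from a sentence-level label to a numeric polarity.
LABEL_POLARITY: Dict[str, float] = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}


def lexicon_score(tokens: List[str]) -> float:
    """Sum emotion-word polarities, each scaled by a directly preceding degree adverb."""
    score = 0.0
    for i, token in enumerate(tokens):
        if token in EMOTION_WORDS:
            # Default weight is 1.0 when no degree adverb precedes the emotion word.
            weight = DEGREE_ADVERBS.get(tokens[i - 1], 1.0) if i > 0 else 1.0
            score += weight * EMOTION_WORDS[token]
    return score


def classify_paragraph(
    sentences: List[Tuple[List[str], str]], threshold: float = 0.5
) -> str:
    """Combine each sentence's label polarity with its lexicon score.

    `sentences` holds (tokens, sentence_label) pairs; the label is assumed
    to come from a separate sentence-level classifier.
    """
    total = 0.0
    for tokens, label in sentences:
        total += LABEL_POLARITY[label] + lexicon_score(tokens)
    if total > threshold:
        return "positive"
    if total < -threshold:
        return "negative"
    return "neutral"


paragraph = [
    (["very", "good"], "positive"),
    (["slightly", "bad"], "negative"),
    (["it", "rains"], "neutral"),
]
result = classify_paragraph(paragraph)
```

In this sketch the strongly positive first sentence outweighs the mildly negative second one, so the paragraph is labeled positive. A real system would replace the placeholder lexicons with Tibetan emotion-word and degree-adverb dictionaries and would tune the weighting on labeled paragraphs.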
Keywords/Search Tags: Tibetan text preprocessing, neural network, word vector, Tibetan sentence and text emotion analysis