Research On Tibetan Sentiment Analysis System For Social Media

Posted on:2022-12-31

Degree:Master

Type:Thesis

Country:China

Candidate:T T Zhang

GTID:2505306746451964

Subject:Computer technology

Abstract/Summary:

With the continued growth of the Internet, mainstream platforms such as Weibo and WeChat now support the display of Tibetan, and the volume of Tibetan opinion content online is growing rapidly. Short Tibetan social media texts have become an important part of the opinions that Internet users express. Analyzing and processing these opinions plays an important role in strengthening network security management and promoting scientific government decision-making. However, the construction of Tibetan corpus resources lags behind, which poses many challenges for Tibetan text research. To improve the accuracy of Tibetan sentiment classification, this thesis proposes a Tibetan-Chinese cross-language sentiment analysis model. Drawing on rich Chinese corpus resources, it establishes knowledge correspondences between Tibetan and Chinese and uses cross-language sentiment classification techniques to share feature resources between the two languages. In this way, the technical problems caused by the scarcity of Tibetan text resources can be alleviated to a certain extent. The main work of this thesis is as follows:

First, a Tibetan-Chinese bilingual sentiment database is built from short texts on social media. Short comment texts in Tibetan and Chinese from social media platforms serve as the raw data. The corpus is preprocessed through cleaning, stop-word removal, tagging, and word segmentation, then standardized and stored in the database.

Second, a co-training (collaborative training) algorithm is introduced into the Tibetan-Chinese cross-language sentiment classification task, and a cross-language sentiment classification model based on semi-supervised co-training is constructed. The balanced Tibetan-Chinese bilingual dataset is treated as two different views for bilingual co-training, and the lack of sentiment resources and labeled samples in Tibetan is addressed with the help of abundant labeled data in Chinese. The experimental results show that the co-training algorithm enhances the Tibetan sentiment classifier's ability to learn from unlabeled samples.

Third, an adversarial network is introduced to improve Tibetan-Chinese cross-language sentiment classification. Chinese-Tibetan bilingual word vectors map the two languages into the same shared space, and a language adversarial network learns the joint features of Chinese and Tibetan so that sentiment knowledge is shared between the two languages. The resulting cross-language sentiment classification model, based on the adversarial network, achieves better results when only a small Tibetan sentiment annotation set is available.

Fourth, a Tibetan-Chinese cross-language sentiment analysis algorithm based on an end-to-end method is proposed. Building on the adversarial network, the model is improved with the end-to-end method, and an unsupervised end-to-end strategy is adopted to model sentences in Tibetan-Chinese language pairs. The gap between the two languages is eliminated by calculating the probability of language pairs, which addresses the problem of insufficient annotated corpus.
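
The two-view co-training idea described in the second contribution can be illustrated with a minimal sketch. It is not the thesis's implementation: the nearest-centroid classifier, the toy two-dimensional feature vectors, and every name below (`centroids`, `predict`, `co_train`) are assumptions chosen to keep the example self-contained. Each sample is assumed to have a Tibetan-view and a Chinese-view feature vector; in each round, the classifier of one view pseudo-labels the unlabeled samples it is most confident about, and those samples are added to the shared training set used by both views.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def centroids(X: List[Vector], y: List[int]) -> Dict[int, Vector]:
    """Train a nearest-centroid classifier: one mean vector per label."""
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(cents: Dict[int, Vector], x: Vector) -> Tuple[int, float]:
    """Return (label, confidence); confidence is the distance margin
    between the nearest and the second-nearest centroid."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), lab) for lab, c in cents.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(lab_a, lab_b, labels, unl_a, unl_b, rounds=5, per_round=1):
    """Hypothetical co-training loop over two views (a = Tibetan, b = Chinese).

    lab_a / lab_b: labeled feature vectors for each view (same order).
    unl_a / unl_b: unlabeled feature vectors for each view (same order).
    """
    lab_a, lab_b, labels = list(lab_a), list(lab_b), list(labels)
    pool = list(range(len(unl_a)))      # indices still unlabeled
    pseudo: Dict[int, int] = {}         # index -> pseudo-label
    for _ in range(rounds):
        for train, unl in ((lab_a, unl_a), (lab_b, unl_b)):
            if not pool:
                break
            cents = centroids(train, labels)
            # Rank the pool by this view's confidence, most confident first.
            ranked = sorted(pool, key=lambda i: predict(cents, unl[i])[1],
                            reverse=True)
            for i in ranked[:per_round]:
                label, _ = predict(cents, unl[i])
                # The pseudo-labeled sample extends BOTH views' training data.
                lab_a.append(unl_a[i])
                lab_b.append(unl_b[i])
                labels.append(label)
                pseudo[i] = label
                pool.remove(i)
    return centroids(lab_a, labels), centroids(lab_b, labels), pseudo


# Toy data (made up): label 0 = negative, label 1 = positive.
tib_labeled = [[0.0, 0.0], [1.0, 1.0]]
chi_labeled = [[0.0, 0.0], [1.0, 1.0]]
gold = [0, 1]
tib_unlabeled = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1], [1.0, 0.8]]
chi_unlabeled = [[0.2, 0.1], [0.8, 0.9], [0.0, 0.1], [0.9, 1.0]]

tib_model, chi_model, pseudo_labels = co_train(
    tib_labeled, chi_labeled, gold, tib_unlabeled, chi_unlabeled
)
print(pseudo_labels)
```

In a realistic setting the two views would come from real Tibetan and Chinese text features and a stronger classifier; the sketch only shows how confident predictions from one view enlarge the training data available to the other.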

Keywords/Search Tags:

Tibetan-Chinese bilingualism, sentiment classification, cross-language, collaborative training, adversarial network

Related items

1	Research On Chinese-Vietnamese Cross-language Object-level Sentiment Analysis Method For Social Media Text
2	Design And Implementation Of Korean-Chinese Cross-Language Text Classification Based On Multi-Layer Semantic Feature Alignment
3	Research On Sentiment Classification Technology Of Tibetan Text
4	The Study On The Representation Of Bilingual Syntactic Representation Of Tibetan College Students By Syntax
5	Research On Sentiment Analysis Model Of Movie Reviews Based On Further Pre-training And Feature Fusion
6	Convolutional Neural Network For Sentiment Classification Based On Sentiment Special Word Embeddings
7	Research On Cross-language Sentiment Analysis Method For Chinese And Vietnamese Social Media Text
8	Research On Sentiment Classification Method Of Japanese Reviews For Hotel Industry
9	Research On Song Sentiment Classification Method For Chinese Lyrics
10	Text Analysis Of Speech Synthesis Based On Statistical Parameters Of Tibetan Language In Specific Fields