| Advances in technology have revolutionized the dissemination of tourism information, and travelogues shared via social media have emerged as a significant source of influence. Travelers often consult travelogues for information to support their travel decisions, so studying sentiment tendencies in travelogue texts has clear practical value. Although sentiment analysis of travel reviews has been effective, sentiment analysis of travelogue texts requires further development. These texts are complex and lengthy, with an uneven distribution of sentiment, which makes accurate sentiment analysis difficult. This thesis therefore proposes a method for sentiment analysis of travelogue texts based on the Longformer model. The approach aims to address the gaps in current sentiment analysis of travelogue texts and the limitations in processing lengthy travelogue texts. The main work of this thesis is as follows. 1) To address the lack of a publicly available sentiment dataset for travelogue texts, this research constructs a dataset for analyzing sentiment tendencies in such texts. Travelogue texts covering various cities in different regions were collected from the Ctrip platform using crawlers. After data pre-processing, text exploration, and sentiment annotation, 3444 valid travelogue texts were obtained, comprising 1896 positive and 1548 negative texts. This dataset provides the foundation for the subsequent research and facilitates a better understanding of sentiment tendencies in travelogue texts. 2) This research explores the application of text pre-training models in the tourism field, with a particular focus on the Longformer pre-training model. For the first time, this study introduces the Longformer pre-training model to the tourism 
domain and compares its performance with the BERT model on short travel reviews and long travelogue texts. The experimental findings show that the Longformer model surpasses the BERT model in feature extraction on both travel review and travelogue texts. To underscore the efficacy of the Longformer model on travelogue texts, this study also evaluates the RoBERTa, ALBERT, and XLNet models in the travel domain; the experimental results show that the Longformer model outperforms these pre-trained models on the travel text sentiment classification dataset. Consequently, the Longformer model holds great potential in the tourism domain and can help achieve more accurate and comprehensive sentiment classification of travelogue texts. 3) This research proposes a sentiment classification model for travelogue texts based on the Longformer pre-trained model. To address the incomplete information extraction that results from Longformer’s local attention mechanism, this study combines Longformer with various recurrent neural network modules to enhance the model’s feature extraction ability. The experimental results show that combining Longformer with RNN or LSTM does not significantly improve the model’s performance, whereas combining it with GRU, BiLSTM, or BiGRU does. The combination of Longformer and BiGRU performs best, and ablation experiments further confirm the necessity of the BiGRU feature extraction layer. The proposed sentiment classification model therefore has practical value for handling long travelogue texts and uneven sentiment distribution, with the combination of Longformer and BiGRU being the preferred configuration. |
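
As a rough illustration of why a purely local attention pattern can miss long-range context, the sketch below builds a sliding-window attention mask of the kind Longformer uses, with an optional set of globally attending positions. The function name `sliding_window_mask`, the window size, and the sequence length are illustrative assumptions and are not taken from the thesis.

```python
def sliding_window_mask(seq_len, window, global_positions=()):
    """Return mask[i][j] = True when token i may attend to token j.

    Hypothetical helper for illustration only:
    - each token attends to neighbours within window // 2 positions;
    - tokens listed in global_positions attend to, and are attended by,
      every other token.
    """
    half = window // 2
    global_set = set(global_positions)
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            is_local = abs(i - j) <= half
            is_global = i in global_set or j in global_set
            row.append(is_local or is_global)
        mask.append(row)
    return mask


# Assumed toy setting: 8 tokens, window of 2, first token global.
mask = sliding_window_mask(seq_len=8, window=2, global_positions=(0,))
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

In this mask, two distant tokens that are not global cannot attend to each other within a single layer. Passing the encoder outputs through a recurrent layer such as BiGRU, as the abstract describes, is one way to carry information across those gaps.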