
Analysis On The Satisfaction Of Homestay Tenants Based On Text Mining

Posted on: 2024-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Tao
Full Text: PDF
GTID: 2568307052481614
Subject: Applied Statistics
Abstract/Summary:
With the rapid growth of the tourism industry and online travel platforms, tourists are increasingly inclined to post candid reviews online. These reviews contain a great deal of useful information and directly reflect the real experience of guests. Later tourists consult earlier reviews when choosing accommodation, and for B&B operators, online reviews provide real-time feedback on guests' opinions as well as business suggestions that support further improvement. It is therefore valuable to analyze online reviews and extract the useful information they contain. This thesis takes the online reviews of homestays in popular tourist cities of Yunnan Province on the Ctrip platform as its research object, and applies text mining techniques (clustering, keyword extraction, sentiment analysis and topic analysis) to study guest satisfaction as expressed in homestay reviews. The main research contents and conclusions are as follows:

(1) Ctrip was selected from among the online travel platforms, and homestay information and review data were crawled for Dali, Lijiang, Xishuangbanna and Shangri-La, four popular tourist cities in Yunnan Province. A total of 149,473 reviews were collected, of which 131,708 remained after data cleaning. jieba and pyltp were used to segment the cleaned reviews into words, in combination with a constructed word list and a custom dictionary; feature words were then obtained and word frequency statistics were computed. The results show that the aspects guests care about most are the room, the host, service, location and hygiene.

(2) Text vectorization was carried out on all reviews after preprocessing. After dimensionality reduction, text clustering was performed with k-means, hierarchical clustering and DBSCAN respectively. According to three clustering evaluation indexes, namely the silhouette coefficient, the Calinski-Harabasz (CH) score and the Davies-Bouldin index (DBI), k-means with k=3 gave the best clustering result. The three clusters correspond to geographical location, the service of the host or homestay, and the internal and external conditions of the room. Keyword extraction was carried out with TF-IDF and TextRank to obtain the keywords of all review texts together with their TF-IDF and TextRank values. Keywords were then extracted separately for the reviews of each city to identify what guests in each city focus on.

(3) After word segmentation and part-of-speech tagging of the review text, a sentiment score was calculated based on the HowNet sentiment dictionary. The shares of positive, negative and neutral reviews in the popular tourist cities of Yunnan Province are 95%, 2% and 3% respectively, and the accuracy of sentiment classification based on the sentiment dictionary reaches 89%. The sentiment scores of Dali, Lijiang, Xishuangbanna and Shangri-La are 18.1815, 18.5480, 18.1647 and 17.0332 respectively. Because the positive and negative reviews identified by sentiment polarity are highly imbalanced, oversampling and undersampling were combined to rebalance the data. Logistic regression, naive Bayes, SVM and decision tree models were then built for binary classification. The SVM classifier achieved the highest accuracy, precision, recall and F1 score, making it the best-performing and most stable classifier.

(4) The LDA model was used to analyze the topics of the review text. Guests' reviews of the four cities in Yunnan Province fall mainly into five aspects: service quality, external environment, internal conditions, geographical location and sense of experience. The keyword extraction and topic extraction results were combined to determine the first-level topics and feature words of guest satisfaction, to construct an index system, and to build a satisfaction calculation formula based on TF-IDF values. The satisfaction scores of Dali, Lijiang, Xishuangbanna and Shangri-La are 26.7524, 28.7172, 26.3632 and 26.5128 respectively, so Lijiang has the highest satisfaction score among the four cities. Among the first-level topics, the room interior has the highest satisfaction index, and the feature words "room" and "experience" have the highest satisfaction scores.
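
The abstract does not spell out how TF-IDF values are computed for keyword extraction. The following is only a minimal sketch of one common formulation (term frequency times a smoothed inverse document frequency, summed over reviews), written in plain Python. The function name `tfidf_keywords`, the smoothing choice and the toy English tokens are assumptions for illustration; they are not taken from the thesis, which works on Chinese text segmented with jieba and pyltp.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Rank terms by TF-IDF summed over all documents.

    docs is a list of token lists (reviews that are already segmented).
    Hypothetical helper for illustration; not the thesis's implementation.
    """
    n_docs = len(docs)

    # Document frequency: in how many reviews does each term appear?
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    scores = Counter()
    for tokens in docs:
        if not tokens:
            continue  # skip empty reviews to avoid division by zero
        tf = Counter(tokens)
        for term, count in tf.items():
            # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1
            scores[term] += (count / len(tokens)) * idf

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy, made-up reviews standing in for segmented review text.
reviews = [
    ["room", "clean", "boss", "friendly"],
    ["room", "location", "convenient"],
    ["boss", "friendly", "service", "good"],
]
keywords = tfidf_keywords(reviews, top_k=3)
for term, score in keywords:
    print(f"{term}\t{score:.4f}")
```

Summing per-review scores favors terms that recur across many reviews, which matches the goal of finding what guests mention most; a per-city ranking would simply call the same function on each city's reviews separately.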
Keywords/Search Tags: Text mining, Keyword extraction, Sentiment analysis, Topic analysis, Satisfaction of homestay tenants