| As consumer demand for accommodation grows, people are no longer satisfied with hotels as the only option. At the same time, China has a large stock of idle housing, which provides a basis for the development of online short-term rentals. In recent years, many online short-term rental platforms have sprung up, such as Xiaozhu and Tujia. The housing offered on these platforms differs considerably in its characteristics from traditional hotel accommodation. Against the backdrop of a booming sharing economy, this dissertation takes the living habits of domestic residents into account, crawls real data from Tujia, and uses relatively large-scale, multidimensional data to fit the sales of online short-term rental housing. It studies the main factors affecting housing sales in order to supplement previous theoretical research on online short-term rentals, which is important for promoting the development of this market. The dissertation first reviews the literature on the online short-term rental market and then sets out the theoretical basis of the Lasso, GBRT and XGBoost methods. After analyzing the differences among online short-term rental platforms in China, we choose Tujia as the research platform, crawl housing data for Chengdu, Shanghai, Beijing and Guangzhou with a Python program, and classify the sentiment of the online reviews in the four cities into three categories: positive, neutral and negative. We perform word segmentation and word-frequency analysis on the positive and negative review texts, group words of the same category, and summarize the characteristics that users pay attention to. The eight characteristics we obtain are housing attributes, landlord attributes, cleanliness, location, management services, facilities services, satisfaction and housing characteristics. We then build an indicator system based on these characteristics. In 
addition, this dissertation treats online short-term rental as experiential consumption, in which users may get a more intuitive impression from real uploaded photos. We therefore add the ratio of the number of reviews containing photos to the total number of reviews as a measure of the degree of user feedback: the higher the degree of feedback, the more comprehensive the information. We then obtain the corresponding feature data with a web crawler, preprocess the data, and construct the Lasso-XGBoost model from the factors whose Lasso regression coefficients are nonzero. We compare this model with three single models (GBRT, RF and XGBoost) and two combined models (Lasso-GBRT and Lasso-RF). The results show that the Lasso-XGBoost model fits the data very well and shows a clear advantage in fit, and its training performance is also high. According to the Lasso-XGBoost fitting results, among the variables affecting the sales of online short-term rental housing in the different regions, the satisfaction indices (positive rate, negative rate) and the feedback index (the proportion of reviews with photos among all online reviews) rank relatively high in importance. Although their order of importance differs across the regional markets, they all have a very important impact on housing sales. At the same time, because the satisfaction and feedback indices are so important, the impact of other factors, such as the individual rating scores of housing, is weakened. |
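
The satisfaction indices (positive rate, negative rate) and the feedback index (share of reviews with photos) described in the abstract are simple ratios over a listing's reviews. The following is a minimal sketch of how they might be computed, not the dissertation's actual code. The function name `review_indices` and the record keys `sentiment` and `has_photo` are hypothetical, and the sample reviews are made up for illustration.

```python
from collections import Counter


def review_indices(reviews):
    """Compute satisfaction and feedback indices for one listing.

    `reviews` is a list of dicts with two hypothetical keys:
      "sentiment": "positive", "neutral" or "negative"
      "has_photo": True if the review contains at least one photo
    """
    total = len(reviews)
    if total == 0:
        # With no reviews the ratios are undefined, so report None.
        return {"positive_rate": None, "negative_rate": None, "photo_ratio": None}

    # Count reviews per sentiment class; missing classes count as 0.
    counts = Counter(r["sentiment"] for r in reviews)
    with_photo = sum(1 for r in reviews if r["has_photo"])

    return {
        "positive_rate": counts["positive"] / total,  # satisfaction index
        "negative_rate": counts["negative"] / total,  # satisfaction index
        "photo_ratio": with_photo / total,            # feedback index
    }


# Made-up sample: four reviews for a single listing.
sample = [
    {"sentiment": "positive", "has_photo": True},
    {"sentiment": "positive", "has_photo": False},
    {"sentiment": "neutral", "has_photo": False},
    {"sentiment": "negative", "has_photo": True},
]
indices = review_indices(sample)
```

In a pipeline like the one the abstract outlines, values such as these would be computed per listing and joined to the other crawled features before Lasso selection and model fitting.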