Research On The Relationship Between Web Search Data And Traditional Data

Posted on: 2020-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Qiu
Full Text: PDF
GTID: 2439330623452478
Subject: Quantitative Economics
Abstract/Summary:
In this paper, multifractal spectrum theory is used to analyze the interaction between keyword search indices and the CPI, and the Chow-Lin model and the Extra Trees model are used to produce weekly predictions of year-on-year CPI data.

First, we designed an experimental comparison of three time series frequency conversion models and found that the Chow-Lin model performs best, with an MSE of 0.077645. We then used the Chow-Lin model to convert the monthly CPI to weekly frequency. A t-test gives a P value of 0.54, so the "month" values implied by the converted weekly series are not significantly different from the real monthly CPI data.

Second, we used time difference correlation analysis to screen 23 keywords from 176 original words and analyzed them using multifractal spectrum theory. For the weekly data, we find that the long-term interaction between 80% of the keywords and the CPI shows long-term persistence and a low degree of multifractality, while the short-term interaction shows greater volatility and a greater degree of multifractality. The intersection between the long-term and short-term intervals lies between 8 and 16 months, and the average over the 23 keywords is about 11 months. In other words, on average the interaction between keyword data and the CPI is stable over continuous intervals longer than 11 months. The monthly data show a similar pattern, but the average intersection is about 17 months.

Finally, we used Extra Trees to build a model that not only supports monthly forecasting but also predicts the weekly CPI one month in advance. Using leave-one-out cross validation, we found that the Extra Trees model performs best when the number of principal components is 8, with an MSE of only 0.07784. On the weekly test set, the MSE is 0.11314024 and the MPE is 0.00234, that is, the predicted value deviates from the true value by less than 0.235%. Against the real monthly data, the MSE is 0.12172161 and the MPE is 0.00241, that is, the relative deviation of the predicted value from the true monthly CPI value is less than 0.25%.

In summary, taking the CPI as an example, we studied the relationship between web search data and traditional data from two perspectives. From the perspective of interaction, the multifractal spectrum analysis indicates that web search data and the CPI have a stable long-term interaction. From the perspective of predictive power, our results show that web search data can be used to predict traditional data effectively and in real time. In particular, based on the Chow-Lin model, this paper extends the analysis of the two series from the "month" frequency to the "week" frequency for the first time.
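
The keyword screening step relies on time difference correlation analysis. The abstract does not give the exact procedure, so the following is only a minimal sketch of the general idea: compute the Pearson correlation between a keyword index and the CPI at a range of leads and lags, then keep the lag with the largest absolute correlation. The function names, the toy series, and the lag window of 4 periods are illustrative assumptions and are not taken from the thesis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(vx * vy)


def lagged_correlations(keyword, cpi, max_lag):
    """Correlate keyword[t] with cpi[t + lag] for lag in -max_lag..max_lag.

    A positive lag means the keyword series leads the CPI by `lag` periods.
    """
    n = len(keyword)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            k, c = keyword[:n - lag], cpi[lag:]
        else:
            k, c = keyword[-lag:], cpi[:n + lag]
        result[lag] = pearson(k, c)
    return result


def best_lag(keyword, cpi, max_lag):
    """Return (lag, correlation) with the largest absolute correlation."""
    corr = lagged_correlations(keyword, cpi, max_lag)
    lag = max(corr, key=lambda l: abs(corr[l]))
    return lag, corr[lag]


# Toy data (hypothetical): the "CPI" repeats the keyword index two periods later.
keyword_index = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12, 10, 11]
toy_cpi = [0, 0] + keyword_index[:-2]

lag, r = best_lag(keyword_index, toy_cpi, max_lag=4)
print(f"best lag = {lag}, correlation = {r:.3f}")
```

In a screening setting, one would presumably run `best_lag` for every candidate keyword and keep those whose peak correlation exceeds some threshold; the threshold that reduced 176 words to 23 is not stated in the abstract.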
Keywords/Search Tags:Consumer Price Index, Chow-Lin model, Multifractal spectrum, Extremely randomized trees, Weekly forecast