Research On Stock Market Prediction Based On Financial News Text Data Mining

Posted on: 2020-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: S Yong
GTID: 2429330572966725
Subject: Financial master
Abstract/Summary:
As an important part of the financial market, the stock market plays a pivotal role in the financial sector, and both academics and investors take a strong interest in predicting it. Most traditional stock market forecasting research starts from the stock market itself, relying on historical price and trading volume data, and rarely studies the relationship between news and the stock market from the perspective of internet financial news. Since the arrival of the internet era, the dissemination and acquisition of information have become faster and more convenient. Investors can obtain relevant information in time through internet financial websites to assist them in making investment decisions. Therefore, the impact of financial news on the stock market, and how to use news information to forecast the stock market, are of practical significance.

Based on financial news text, this paper studies the impact of financial news on the stock market, and the prediction of the stock market from that news, from the perspective of text mining. Because text is unstructured data, the collected financial news texts must first be processed. R software is used to perform word segmentation and feature extraction on the financial news: keywords are extracted from the massive text, keyword frequencies are obtained, and the keywords are clustered and preliminarily screened. The Random Forest algorithm is then used to obtain the importance of each keyword, and the number of keywords is reduced based on that importance. The final keyword variables are imported into the model for the next stage of analysis.

The Shanghai Stock Index is selected as the proxy variable for the stock market, and the index's rise and fall is treated as a two-class problem, denoted by 1 and 0 respectively. In terms of data processing, the keyword Baidu index data and the Shanghai Stock Index data were processed separately: the logarithm of the Baidu index was used instead of the raw Baidu index data, and the daily yield was calculated from the Shanghai Stock Index data of each trading day. Finally, the processed keyword Baidu index data and Shanghai Stock Index data were combined, a Support Vector Machine (SVM) was used for modeling and forecasting, and a simulated trading strategy was constructed from the forecast results. The trading return is compared with the index return over the same period to analyze whether excess returns can be obtained. As a control, the Random Forest algorithm was also used to build a prediction model and a trading strategy, and the forecasting performance and simulated trading returns of the two methods were compared. To account for the cyclicality of the stock market, financial news data and Shanghai Composite Index data were selected from a bull market stage and a bear market stage respectively, to compare whether the impact of financial news on the stock market differs across market conditions and whether stock market forecasts differ significantly across them.

From this research, the paper draws the following four conclusions:

(1) There is a relationship between financial news text and the stock market. The stock market forecasting model based on financial news text achieved good predictive performance in both the bull market stage and the bear market stage.

(2) The relationship between internet financial news and the stock market differs considerably across market conditions, and so does the forecasting performance of models based on financial news text. In the bull market stage, financial news and the stock market are more closely related, and the predictions of both the SVM model and the Random Forest model are better than those of the same two models during the bear market.

(3) The choice of model affects the forecasting performance. The SVM model predicts better than the Random Forest model in both the bull market and the bear market, and the gap between the two models does not fluctuate greatly across market conditions.

(4) The cyclical stage of the stock market affects the return of the trading strategy constructed from the forecast results. In the bull market stage, the trading strategy based on the forecast results shows a slight advantage: the yields of the two models were slightly higher than the index return over the same period. In the bear market stage, the yields of the two models were higher than the index return over the same period. Each model's trading return has an advantage in a specific market condition. In the bull market stage, the trading strategy based on the Random Forest algorithm yields more than the strategy based on the SVM algorithm. In the bear market stage the result is the opposite: the SVM-based trading strategy has a significantly higher return than the strategy based on the Random Forest algorithm.
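
The data-processing and simulated-trading steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation (the thesis reports using R), and several details are assumptions because the abstract does not specify them: the logarithm is taken as the natural log, the daily yield is taken as the simple return (P_t - P_{t-1}) / P_{t-1}, a flat day is labeled 0, and the trading rule is "hold the index on days predicted to rise, hold cash otherwise". All function names are hypothetical, and the predictions are passed in as a plain list where the output of an SVM or Random Forest classifier would go.

```python
import math


def log_index(values):
    """Natural log of a search-index series (assumed base; values must be > 0)."""
    return [math.log(v) for v in values]


def daily_returns(closes):
    """Simple daily return (P_t - P_{t-1}) / P_{t-1}; assumed definition of 'daily yield'."""
    return [(closes[i] - closes[i - 1]) / closes[i - 1] for i in range(1, len(closes))]


def direction_labels(returns):
    """Two-class target: 1 for a rise, 0 for a fall (a flat day is assumed to be 0)."""
    return [1 if r > 0 else 0 for r in returns]


def strategy_return(returns, predictions):
    """Cumulative return of holding the index only on days predicted as 1.

    Assumed rule for illustration: fully invested when the prediction is 1,
    in cash (zero return) otherwise, with no transaction costs.
    """
    wealth = 1.0
    for r, p in zip(returns, predictions):
        if p == 1:
            wealth *= 1.0 + r
    return wealth - 1.0


def buy_and_hold_return(returns):
    """Cumulative return of holding the index over the whole period (the benchmark)."""
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth - 1.0


if __name__ == "__main__":
    # Made-up numbers for illustration only; not data from the thesis.
    closes = [100.0, 102.0, 101.0, 103.0]
    search_index = [1200.0, 1500.0, 900.0]

    returns = daily_returns(closes)
    labels = direction_labels(returns)
    features = log_index(search_index)

    # Stand-in for classifier output; here a forecast that happens to be perfect.
    predictions = labels

    excess = strategy_return(returns, predictions) - buy_and_hold_return(returns)
    print("labels:", labels)
    print("log features:", features)
    print("excess return over the index:", excess)
```

Comparing `strategy_return` against `buy_and_hold_return` over the same days mirrors the abstract's comparison of trading return with the index return of the same period; running it once per model and once per market stage would give the bull/bear, SVM/Random Forest comparison the conclusions refer to.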
Keywords/Search Tags: financial news, data mining, stock market, SVM, RF