
Study On The Construction And Application Of Investor Sentiment Based On Text Mining

Posted on: 2020-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Gou
Full Text: PDF
GTID: 2439330596475295
Subject: Finance
Abstract/Summary:
With the development of online stock trading and of information exchange and sharing, more investors choose to express their opinions, comments and expectations about the stock market or individual stocks on stock forums. This rapidly updated information involves a large number of investors, yet most of the existing literature relies on stored data. This thesis selects real-time news from China's three official securities newspapers and posts from the Eastmoney stock forum. It adopts natural language processing techniques to classify the post text, and uses an expert-system sentiment-value algorithm to extract investor sentiment. It also uses relevant indicators to conduct empirical research on the information content of securities-related texts.

The thesis consists mainly of the construction of the sentiment orientation dictionaries, the extraction of investor sentiment, and the empirical application. First, news from China's three official securities newspapers and texts from the Eastmoney stock forum are crawled as text corpora. Second, drawing on several mainstream Chinese dictionaries, the research uses natural language processing techniques and TF-IDF word frequency statistics to build two Chinese financial sentiment orientation dictionaries, one dedicated to the official securities media and one to the stock forum, by manual and semi-automatic dictionary construction methods. Third, through bag-of-words techniques and sentiment dimensionality reduction, the sentiment index of the three major newspapers and the investor sentiment for individual stocks are obtained. Finally, these measures are applied empirically to stock market and individual stock returns and trading volume. Specifically, to measure the information content of the official media, a Vector Autoregression model is used to analyze the relationship between the three major newspapers' sentiment index and stock market returns, with impulse response analysis and variance decomposition used to examine the effect of the newspapers' sentiment on returns. Cross-sectional regression and time series regression are used to study investor sentiment for individual stocks and to identify the influence of investor sentiment, number of posts and sentiment consistency on individual stock returns and trading volume.

For the three major newspapers, the results show that the sentiment index is negatively correlated with the stock index's return, but its contribution to the stock index's return is low. This indicates that the information in the three newspapers' reports is rapidly absorbed by the market, and that China's stock market is efficient to a certain degree. For the Eastmoney stock forum, the results are as follows. First, investor sentiment for a stock has a significant positive relationship with its return in the same period; however, investors' ability to predict future stock prices is limited. Second, sentiment consistency has a significant negative relationship with trading volume: the lower the consistency (that is, the greater the divergence of opinion), the larger the trading volume in the same period and in the future. Third, the daily number of posts on a stock has a significant positive effect on its daily return, and can also affect its returns in later periods.

The main contributions of this thesis are as follows. On the one hand, in terms of data acquisition, the existing literature mainly relies on stored text data, while this thesis crawls daily updates to capture investors' public opinion in real time. This scheduled crawling, with collection and maintenance on a cloud server, is more conducive to improving the accuracy and timeliness of the statistics. On the other hand, manual and semi-automatic construction methods based on natural language processing are used to build two dedicated dictionaries: the Securities Investment Tendency Dictionary (SISD) for securities news and the Chinese Stock Forum Sentiment Dictionary (CSGSD) for the stock forum. In general, this thesis reflects the influence of investor sentiment on the stock market by studying textual data from different media platforms, and explains the influence of securities media and the market efficiency of information disclosure. It is conducive to the implementation of securities market policies and to public opinion supervision, and also provides investors with an investment reference.
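As a rough illustration of how a dictionary-based, bag-of-words approach can turn one day's forum posts into the three per-stock variables named in the abstract (investor sentiment, number of posts, sentiment consistency), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the tiny English lexicon stands in for the thesis's dictionaries, the posts are assumed to be already segmented into tokens, and the log-ratio sentiment and agreement formulas are one common formulation from the message-board literature, not something the abstract confirms the thesis uses.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the thesis's own dictionaries are not reproduced here.
POSITIVE = {"rise", "gain", "bullish", "buy"}
NEGATIVE = {"fall", "loss", "bearish", "sell"}


def classify_post(tokens):
    """Label a tokenized post +1 (bullish), -1 (bearish) or 0 (neutral).

    Bag-of-words: word order is ignored, only lexicon hit counts matter.
    """
    counts = Counter(tokens)
    score = sum(counts[w] for w in POSITIVE) - sum(counts[w] for w in NEGATIVE)
    return (score > 0) - (score < 0)


def daily_measures(posts):
    """Aggregate one stock-day of tokenized posts into three variables."""
    labels = [classify_post(p) for p in posts]
    bull = labels.count(1)
    bear = labels.count(-1)

    # Assumed sentiment formula: log ratio of bullish to bearish posts,
    # with +1 smoothing so that days with no bearish posts stay finite.
    sentiment = math.log((1 + bull) / (1 + bear))

    # Assumed consistency formula: 1 when all opinionated posts agree,
    # 0 when bullish and bearish posts are evenly split.
    if bull + bear == 0:
        consistency = 0.0  # no opinionated posts: treated as no agreement (assumption)
    else:
        ratio = (bull - bear) / (bull + bear)
        consistency = 1 - math.sqrt(1 - ratio ** 2)

    return {"posts": len(posts), "sentiment": sentiment, "consistency": consistency}


if __name__ == "__main__":
    sample_day = [["buy", "rise"], ["sell"], ["bullish"], ["hello"]]
    print(daily_measures(sample_day))
```

Variables of this kind, computed per stock and per day, are what regressions on returns and trading volume such as those described above would take as inputs.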
Keywords/Search Tags:Data mining, Sentiment analysis, Investor sentiment, China's three official securities newspapers, Eastmoney stock forum