| With the boom of the Internet and the mobile Internet, data is becoming more and more important for decision making. The concept of "Big Data" is changing everyone's daily life as well as the operating and marketing strategies of many industries. The Internet significantly lowers the cost of obtaining large amounts of data, and users can find almost anything they want online. In addition, the emergence of Web 2.0 has linked people more frequently and more closely. As a result, the analysis of public sentiment has become an important research methodology. In the Chinese stock market, most investors are individuals. These individual investors are relatively irrational, and their investment activities are heavily influenced by the information they obtain, even when it is only a rumor. The large number of individual investors therefore adds volatility to the whole market. These investors are also active in many online stock forums, and their sentiment has a direct influence on the stock market. Research on their activities can shed light on the functioning of the Chinese capital market. Based on behavioral finance and data mining theories, this thesis analyzes the text of articles and posts in online stock forums and then calculates sentiment indexes using the Naive Bayes machine learning technique. By running multivariate regressions, we find that these sentiment variables provide significant additional information for forecasting the market return on the next trading day. Moreover, the trading strategy proposed in this thesis earns an excess return of 27.49% and 14.62% over 243 trading days before and after trading costs are taken into consideration, respectively. |
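
The abstract does not say how the sentiment indexes are built from the classifier output. The following is a minimal sketch under stated assumptions, not the thesis's actual implementation. It trains a multinomial Naive Bayes classifier with Laplace smoothing on tokenized posts and aggregates one day's predictions into an index. The two labels (`bullish`, `bearish`), the index formula `(bullish - bearish) / (bullish + bearish)`, the function names, and the toy data are all hypothetical. Real forum posts in Chinese would also need word segmentation first.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_posts):
    """Fit a multinomial Naive Bayes model on (tokens, label) pairs."""
    label_counts = Counter()            # number of posts per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocabulary = set()
    for tokens, label in labeled_posts:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, word_counts, vocabulary


def classify(tokens, model):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    label_counts, word_counts, vocabulary = model
    total_posts = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_posts)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocabulary:
                continue  # ignore words never seen in training
            smoothed = (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def sentiment_index(posts, model):
    """Hypothetical daily index: (bullish - bearish) / (bullish + bearish)."""
    labels = Counter(classify(tokens, model) for tokens in posts)
    bullish, bearish = labels["bullish"], labels["bearish"]
    total = bullish + bearish
    return 0.0 if total == 0 else (bullish - bearish) / total


# Toy, made-up training data (already tokenized); not taken from the thesis.
training = [
    (["buy", "rally", "strong"], "bullish"),
    (["gain", "buy", "rise"], "bullish"),
    (["sell", "crash", "weak"], "bearish"),
    (["loss", "sell", "drop"], "bearish"),
]
model = train_naive_bayes(training)
todays_posts = [["buy", "strong"], ["rally", "rise"], ["sell", "drop"]]
index = sentiment_index(todays_posts, model)
print(f"sentiment index: {index:.3f}")
```

An index like this ranges from -1 (all posts bearish) to 1 (all posts bullish). A series of daily values could then serve as an explanatory variable in a regression on the next day's market return.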