Classification and time series forecasting: Applications in the stock market

Posted on: 2017-02-27
Degree: M.S.
Type: Thesis
University: Southern Methodist University
Candidate: O'Connor, William B
Full Text: PDF
GTID: 2469390011997684
Subject: Computer Science
Abstract/Summary:
In this thesis, we evaluate the effectiveness of time series analysis methods and classification methods in terms of their ability to forecast future stock market values. The differences between the use of time series and classification tools, and the resulting differences in their respective models, are explained. Three specific models from each field are examined in terms of their conceptual and mathematical bases. The three econometric models examined are the Classical Linear Regression Model, the Autoregressive Moving Average (ARMA) model, and the Vector Autoregression model. The three classification models are the Support Vector Machine, Random Forest, and Artificial Neural Network.
After describing the differences between the modeling methods, a model of each type is implemented and used to evaluate financial time series data for 10 publicly traded companies. The models predict the sign of the returns in the next period, and each model is evaluated based on the output it provides. Fitted values from the models are evaluated based on whether the predicted return is realized within the next few trading sessions, with 2-day, 3-day, and 5-day periods used for the evaluation. Where appropriate, models fitted to data sets that are better suited to the given method are also demonstrated.
The models are compared according to a number of criteria, most critically according to the kappa statistics that they achieve. Kappa indicates the superiority of the prediction accuracy over a random predictor, that is, one that simply predicts the most common outcome from the data set. Additionally, the models are evaluated according to how well they predict large returns, how well they predict in periods of high volatility, and how much profit a trader could attain using the recommendations of the model.
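As a rough illustration of the kappa criterion, the sketch below computes Cohen's kappa for sign predictions, alongside the accuracy of a predictor that always outputs the most common class. Whether the thesis uses exactly this formula is an assumption; the abstract only describes kappa as an improvement over a baseline predictor. The function names and the sample sign sequences are hypothetical and made up for the example.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement between two label sequences beyond chance.

    Assumption: this is the standard Cohen's kappa; the thesis may define
    its kappa statistic differently.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    # Observed agreement: fraction of periods where the predicted sign is right.
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    # Expected agreement if predictions were independent of outcomes,
    # given each sequence's own class frequencies.
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    expected = sum(
        actual_counts[c] * predicted_counts[c] for c in actual_counts
    ) / (n * n)
    if expected == 1.0:
        # Only one class appears in both sequences; kappa is undefined,
        # so return 0.0 by convention (an assumption of this sketch).
        return 0.0
    return (observed - expected) / (1.0 - expected)


def majority_baseline_accuracy(actual):
    """Accuracy of always predicting the most common outcome."""
    return Counter(actual).most_common(1)[0][1] / len(actual)


# Hypothetical return signs (+1 up, -1 down) for ten periods.
actual_signs = [1, 1, 1, -1, -1, 1, -1, 1, 1, -1]
predicted_signs = [1, 1, -1, -1, -1, 1, 1, 1, 1, -1]

print("kappa:", round(cohen_kappa(actual_signs, predicted_signs), 3))
print("majority baseline accuracy:", majority_baseline_accuracy(actual_signs))
```

A kappa of 0 means the model agrees with the realized signs no more often than chance would predict, and a kappa of 1 means perfect agreement.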
The results show that, on average, the random forest models achieve the highest kappa for each of the three period lengths, and the ARMA models would achieve the highest profits if trades of equal value are placed for each recommendation. Across all stocks modeled and evaluated over 3-day periods, the different methods all produce kappa values near 30%. The random forests, on average, achieve a kappa of 36%, and the ARMA models achieve an average kappa of 31%. As a method for building models, the ARMA proves much more consistent, with a standard deviation of kappa values of 0.05, compared to 0.10 for random forests. The best method to use depends on the data used as inputs, and for the stocks analyzed in this thesis, ARMA models are best for low volatility stocks; the opposite is demonstrated for random forests.
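The profit criterion (trades of equal value placed for each recommendation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's procedure: it assumes a long position on a predicted positive return, a short position on a predicted negative return, a fixed stake per trade, and no transaction costs. The function name, the stake, and the sample data are hypothetical.

```python
def equal_value_profit(predicted_signs, realized_returns, stake=1.0):
    """Total profit from trading a fixed stake on every recommendation.

    Assumptions of this sketch: +1 means go long, -1 means go short,
    each trade is held for exactly one evaluation period, and there are
    no transaction costs.
    """
    if len(predicted_signs) != len(realized_returns):
        raise ValueError("inputs must have equal length")
    # A correct sign earns |return| * stake; a wrong sign loses the same amount.
    return sum(stake * s * r for s, r in zip(predicted_signs, realized_returns))


# Hypothetical recommendations and the returns realized in each period.
signs = [1, -1, 1]
returns = [0.02, -0.01, -0.01]

print("profit:", equal_value_profit(signs, returns, stake=100.0))
```

Under these assumptions a model with a lower kappa can still earn more than one with a higher kappa, provided its correct calls coincide with the larger returns.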
Keywords/Search Tags:Time series, Models, Classification, Random forests, ARMA, Methods