Comparison of regression and ARIMA models with neural network models to forecast the daily streamflow of White Clay Creek

Posted on:2012-09-15

Degree:Ph.D

Type:Dissertation

University:University of Delaware

Candidate:Liu, Greg Qi


GTID:1459390008992700

Subject:Applied Mathematics

Abstract/Summary:

Linear forecasting models have played major roles in many applications for over a century. If the error terms in a model are normally distributed, linear models are capable of producing the most accurate forecasting results. The central limit theorem (CLT) provides theoretical support for applying linear models.

During the last two decades, nonlinear models such as neural network models have gradually emerged as alternatives for modeling and forecasting real processes. In hydrology, neural networks have been applied to rainfall-runoff estimation as well as to stream and peak flow forecasting. Successful nonlinear methods rely on the generalized central limit theorem (GCLT), which provides theoretical justification for applying nonlinear methods to real processes in impulsive environments.

This dissertation attempts to predict the daily stream flow of White Clay Creek by making intensive comparisons of linear and nonlinear forecasting methods. Data are modeled and forecast by seven linear and nonlinear methods: the random walk with drift method; the ordinary least squares (OLS) regression method; the time series Autoregressive Integrated Moving Average (ARIMA) method; the feed-forward neural network (FNN) method; the recurrent neural network (RNN) method; the hybrid OLS regression and feed-forward neural network (OLS-FNN) method; and the hybrid ARIMA and recurrent neural network (ARIMA-RNN) method. The first three methods are linear and the remaining four are nonlinear. The OLS-FNN method and the ARIMA-RNN method are two completely new nonlinear methods proposed in this dissertation.
These two hybrid methods have three special features that distinguish them from any existing hybrid method available in the literature: (1) using the OLS or ARIMA residuals as the targets of the subsequent neural networks; (2) training two neural networks in parallel for each hybrid method with two objective functions (the minimum mean squared error function and the minimum mean absolute error function); and (3) using the two trained neural networks to obtain their respective forecasting results and then combining those results by a Bayesian Model Averaging technique. Final forecasts from the hybrid methods have linear components resulting from the regression method or the ARIMA method and nonlinear components resulting from the feed-forward or recurrent neural networks.

Forecasting performance is evaluated by both root mean square error (RMSE) and mean absolute error (MAE). Forecasting results indicate that linear methods provide the lowest RMSE forecasts when data are normally distributed and data lengths are long enough, while nonlinear methods provide more consistent RMSE and MAE forecasts when data are non-normally distributed. Nonlinear neural network methods also provide lower RMSE and MAE forecasts than linear methods even for data that are normally distributed but have small sample sizes. The hybrid methods provide the most consistent RMSE and MAE forecasts for data that are non-normally distributed.

The original flow is differenced and log differenced to get two differenced series: the difference series and the log difference series. These two series are then decomposed based on stochastic process decomposition theorems to produce two, three and four variables that are used as input variables in regression models and neural network models.

By working on an increment series, either the difference series or the log difference series, instead of the original flow series, we gain two benefits. First, we have a clear time series model.
The second benefit is that the original flow series is autocorrelated, whereas an increment series is approximately independently distributed. For an independently distributed series, parameters such as the mean and standard deviation can be calculated easily.

In practice, the length of the data used for modeling is very important. Model parameters and forecasts are estimated from 30 data samples (1 month), 90 data samples (3 months), 180 data samples (6 months), and 360 data samples (1 year).
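
As a rough illustration of the quantities named in the abstract, the sketch below computes the difference and log difference series, makes one-step-ahead forecasts with a random walk with drift over a fixed-length window, and scores them by RMSE and MAE. It is a minimal sketch and not the dissertation's implementation: the function names, the choice of the window's mean increment as the drift estimate, and the flow values are all assumptions made for illustration.

```python
import math
from statistics import mean


def difference(flow):
    """Difference series: d[t] = x[t] - x[t-1]."""
    return [b - a for a, b in zip(flow, flow[1:])]


def log_difference(flow):
    """Log difference series: ln(x[t]) - ln(x[t-1]); flows must be positive."""
    return [math.log(b) - math.log(a) for a, b in zip(flow, flow[1:])]


def random_walk_with_drift_forecast(window):
    """One-step-ahead forecast: last value plus the window's mean increment.

    Using the mean increment as the drift is an assumption of this sketch.
    """
    drift = mean(difference(window))
    return window[-1] + drift


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def mae(actual, predicted):
    """Mean absolute error."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def rolling_forecasts(flow, window_length):
    """Forecast each value from the preceding window_length observations."""
    actual, predicted = [], []
    for t in range(window_length, len(flow)):
        window = flow[t - window_length:t]
        predicted.append(random_walk_with_drift_forecast(window))
        actual.append(flow[t])
    return actual, predicted


if __name__ == "__main__":
    # Made-up daily flows for illustration; not White Clay Creek data.
    flow = [10.0, 12.0, 11.0, 13.0, 16.0, 15.0, 17.0, 18.0]
    actual, predicted = rolling_forecasts(flow, window_length=4)
    print("RMSE:", rmse(actual, predicted))
    print("MAE:", mae(actual, predicted))
```

With real data, `window_length` would take the values listed above (30, 90, 180, or 360 daily samples), and the same `rmse` and `mae` functions could score any of the seven methods on a common footing.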

Keywords/Search Tags:

Models, Neural network, ARIMA, Data, Forecasting, Linear, Series, Regression

Related items

1	Forecasting and selling futures using ARIMA models and a neural network
2	Wavelet Neural Network And Its Application To Forecasting For Residential Buildings Market
3	Forecasting RMB Exchange Rate With A Hybrid ARIMA And GRNN Model
4	Study On Modeling And Forecasting In The Mobile Network By Time Series Analysis Technology
5	Methods And Techniques On The Short-term Forecasting Of Time Series
6	Prediction Of Stock Price Based On LS-SVM
7	Futures Prices Forecasting Based On Neural Network Models
8	Research On Demand Forecasting Model Of ARIMA-BP Neural Network Based On CPFR
9	Research On Demand Forecasting Model Of Arima-bp Neural Network Based On Cpfr
10	Research On The Application Of Data Mining Technology In Human Resource Demand Forecasting