
Financial Data Analysis Based On Parallel Statistical Computing

Posted on: 2013-01-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G B Guo
GTID: 1119330374980702
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Modern computer systems are now powerful enough to make many common statistical computations effectively instantaneous. However, important situations remain in which a result can take days to compute, especially statistical inference on large-sample data or sampling from large, complex data sets. As a result, either faster but less accurate methods are used, or potentially important computations are skipped entirely. The development of parallel statistical computation is therefore very important. In this thesis we report the results of our studies on speedup opportunities in financial data, including wage data, bankruptcy data and pension fund data. We find that good speedup and other performance gains are available for large statistical inference problems. This dissertation consists of five chapters, whose main contents are as follows:

Chapter One. Parallel statistical computing is a very interesting problem: many statistical calculations are embarrassingly parallel, so such research appears to play an important role at the interface between parallel and statistical computation. This chapter is concerned with parallel statistical computing in regression problems, nonparametric inference, and stochastic processes. In particular, we review the parallel multisplitting method, parallel methods for least squares in linear regression, and parallel statistical computing in multiple linear regression; the theoretical framework of the parallel bootstrap in nonparametric inference; and parallel methods for Markov chains and parallel Markov chain Monte Carlo. Importantly, we also survey non-graphics applications on GPUs. We conclude that there is a need for further research in parallel statistical computing, and describe some of the important unsolved problems.

Chapter Two. In fitting multiple linear models, the choice of subsets and the run time are important questions.
To address them, we introduce a new parallel maximum likelihood estimator for multiple linear models. We first give an equivalence condition between the method and the generalized least squares estimator, and consider the rank of the projections and the eigenvalues. We then present the error of the estimator when a stable solution exists, and give several theorems on this error. In addition, we use the proposed method to fit bankruptcy data, obtain an estimating equation for the data sets, and report the execution time of the method on two simulated data sets.

Chapter Three. We explore the convergence theory of the multiplicative Schwarz method and the damped additive Schwarz method for fitting GLMs and GAMs with large sample sizes. For GLMs and GAMs with large samples, we suggest Schwarz methods for the QL and the penalized QL. The Schwarz methods use a sequence of sub-models, each corresponding to a subset of the components of δ, and the sub-models are patched together to yield the solution for the full model. The technique might be useful for model comparison, where the fitted values from a sub-model are used as starting values for a larger model.

Chapter Four. The parallel bootstrap is an extremely useful statistical method with good timing performance. However, a theoretical study of the method is lacking. In this chapter, we introduce a working correlation matrix for the method, called the parallel bootstrap matrix. We consider some properties of the resampling and the related optimal subsample lengths in smooth function models. We also present the timing performance of parallel bootstrap estimators, and some performance results of subsample length selection on financial time series data.

Chapter Five. We study computational schemes for quasi-stationary distributions of Markov chains whose matrices are quasi-stochastic, i.e., all of their row sums are less than or equal to one. We develop Schwarz methods for the corresponding distributions.
In particular, we establish the semiconvergence of the additive and multiplicative Schwarz methods, and of two-level Schwarz iterative methods, for the quasi-stationary distributions (QSDs). We provide two examples of Markov chains with QSDs to illustrate our methods.
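
To make the object studied in Chapter Five concrete, the following is a minimal sketch of what a quasi-stationary distribution is for a matrix whose row sums are at most one. It uses plain power iteration, not the Schwarz methods developed in the thesis, and it assumes the matrix is small, dense, and primitive so that the iteration converges. The function name `quasi_stationary_distribution` and the 2-by-2 matrix `Q` are illustrative assumptions and do not come from the dissertation.

```python
def quasi_stationary_distribution(Q, tol=1e-12, max_iter=10_000):
    """Approximate the QSD of a substochastic matrix Q by power iteration.

    Q is a list of rows with nonnegative entries and row sums <= 1.
    Returns (qsd, survival), where qsd is the normalized left Perron vector
    and survival is the per-step probability of not being absorbed when the
    chain is started from the previous iterate (approximately the QSD).
    Assumption: Q is primitive, so the iteration converges.
    """
    n = len(Q)
    v = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain on the transient states: w = v Q
        w = [sum(v[i] * Q[i][j] for i in range(n)) for j in range(n)]
        mass = sum(w)  # probability mass that has not been absorbed
        if mass <= 0.0:
            raise ValueError("all probability mass was absorbed")
        w = [x / mass for x in w]  # condition on survival
        if max(abs(a - b) for a, b in zip(v, w)) < tol:
            return w, mass
        v = w
    raise RuntimeError("power iteration did not converge")


# Hypothetical example: each row sums to 0.8, so 0.2 leaks away per step.
Q = [[0.5, 0.3],
     [0.2, 0.6]]
qsd, survival = quasi_stationary_distribution(Q)
```

For this hypothetical `Q`, the left eigenvector equation v Q = 0.8 v is solved by v = (0.4, 0.6), which is what the iteration approaches. Power iteration touches the whole matrix at every step; domain-decomposition schemes such as the Schwarz methods named above instead work on subsets of states, which is what makes them candidates for parallel execution.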
Keywords/Search Tags: statistical computing, regression, nonparametric inference, stochastic processes