
Inference Of Online Approach To Non-parametric Smoothing Of Big Data

Posted on: 2022-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Chen
Full Text: PDF
GTID: 2480306323979569
Subject: Statistics
Abstract/Summary:
"Big data" refers to data that are so large, fast-arriving, or complex that they are difficult or impossible to process with traditional methods. The concept emerged in the early 2000s, and Mayer-Schönberger and Cukier brought the term "big data" to wide attention in their book "Big Data Era". Big data is characterized by large volume, high velocity, variety, and low value density. Its analysis faces two computational problems: the data may be too large to be stored in computer memory, and the computation may be too heavy. These barriers can be addressed by newly developed statistical or computational methods.

Recent big data approaches fall broadly into three categories: subsampling, divide-and-conquer, and sequential updating. The first two work well for big data analysis, but both rely on existing data processing methods and do not apply well to stream data. Because stream data are highly diverse, kernel density estimation and kernel regression, two representative non-parametric methods, are well suited to estimation and prediction: they require few assumptions about the distribution of the data. The online model estimates the bandwidth from the data obtained so far, which avoids the tedious task of recomputing the bandwidth every time new stream data arrive.

In this thesis, we prove the asymptotic properties of the online kernel density and online regression models, carry out statistical inference on them, and establish hypothesis tests. We also construct algorithms that resolve the difficulty of estimating the bandwidth parameter c in the online kernel density and regression models, which is of great importance for the generalization of the model. In simulations, the asymptotic normality of the online density model and the local linear model is verified, and the online linear regression model is applied to the prediction of the volatility index (VIX). The empirical results show that, compared with the traditional local linear predictive regression model, the proposed model performs comparably in predicting continuously arriving option data streams while significantly reducing computational complexity.
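The sequential-updating idea behind an online kernel density estimator can be illustrated with a minimal sketch. The abstract does not state the estimator's exact form, so everything specific here is an assumption: the recursive update f_n(x) = (1 - 1/n) f_{n-1}(x) + (1/n) K((x - X_n)/h_n)/h_n, the bandwidth rule h_n = c * n^(-1/5), the Gaussian kernel, and the hypothetical names `OnlineKDE` and `demo`. It is meant only to show how each new observation can update the estimate without revisiting earlier data.

```python
import math
import random


class OnlineKDE:
    """Recursive kernel density estimate on a fixed grid of points.

    Assumptions (not taken from the abstract): Gaussian kernel and
    bandwidth h_n = c * n**(-1/5). Each observation is used once and
    can then be discarded, so memory does not grow with the stream.
    """

    def __init__(self, grid, c=1.0):
        self.grid = list(grid)                # evaluation points x
        self.c = c                            # bandwidth parameter c
        self.n = 0                            # observations seen so far
        self.density = [0.0] * len(self.grid)

    def update(self, x_new):
        """Fold one new observation into the running estimate."""
        self.n += 1
        h = self.c * self.n ** (-0.2)         # assumed rule h_n = c * n^(-1/5)
        w = 1.0 / self.n
        for j, x in enumerate(self.grid):
            u = (x - x_new) / h
            k = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
            # f_n(x) = (1 - 1/n) * f_{n-1}(x) + (1/n) * K(u) / h_n
            self.density[j] = (1.0 - w) * self.density[j] + w * k / h


def demo(n_obs=5000, seed=0):
    """Stream standard normal draws through the estimator."""
    rng = random.Random(seed)
    grid = [i / 10.0 for i in range(-40, 41)]  # points from -4.0 to 4.0
    kde = OnlineKDE(grid, c=1.0)
    for _ in range(n_obs):
        kde.update(rng.gauss(0.0, 1.0))
    return grid, kde.density
```

Calling `demo()` returns the grid and the estimated density values. In this sketch the cost of each update depends only on the grid size, not on how many observations have already arrived, which is the kind of saving a sequential-updating method aims for.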
Keywords/Search Tags: bandwidth parameter, kernel estimator, online updating estimation, statistical inference, VIX prediction