
Estimation And Application Of Model Uncertainty In Partial Least Squares

Posted on: 2024-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: B B Xi
Full Text: PDF
GTID: 2530307157488034
Subject: Applied Statistics
Abstract/Summary:
In the era of big data, mobile intelligent devices, the cloud, and the Internet of Things collect massive amounts of data that contain a great deal of useful information, so storing, analyzing, and computing on these data is essential. Modeling is one way to analyze data in depth, but when a model is fitted to data, the finite sample introduces sampling error into the fitted model. In practice, the fitted model is nevertheless treated as the "real" model representing the data, which fails to express the input-model uncertainty caused by this sampling error. That uncertainty propagates to the output analysis and may lead to major errors and suboptimal decisions. This thesis therefore studies how to quantify input-model uncertainty.

Error variance estimation is the main way of quantifying model uncertainty, and this thesis uses it to study model uncertainty for high-dimensional data. High-dimensional data are well known to exhibit strong spurious correlation. Among existing error variance estimators, refitted cross-validation (RCV) handles this spurious correlation well, so that its error variance estimate performs as well as the oracle estimator, which makes it a good choice for quantifying model uncertainty in high-dimensional data. However, RCV cannot be applied under non-sparse conditions, and the Lasso model it uses for variable selection cannot bring highly correlated variables into the model. This thesis therefore proposes two improved methods based on the idea of RCV, an RCV method based on sparse partial least squares and an RCV method based on partial least squares, to address these problems.

The thesis presents the two improved methods together with the construction of error variance estimates and prediction intervals for high-dimensional data. Simulation results show that the error variance estimates of the improved methods are closer to the true error variance than those of the RCV method and are as good as those of the oracle estimator. The error variances estimated by the improved methods and by the RCV method are then applied to the construction of prediction intervals. The empirical analysis shows that, at the same confidence level, the improved methods not only relax the assumptions of the RCV estimator but also perform very well in terms of prediction interval and coverage, achieving the expected effect. The proposed methods are therefore preferable to the original RCV method.
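
As a rough illustration of the refitted cross-validation (RCV) idea referred to in the abstract, the sketch below splits the sample in two, selects variables on one half, refits ordinary least squares on the other half, and averages the two resulting variance estimates. It is only a sketch under stated assumptions: marginal-correlation screening stands in for the Lasso, partial least squares, or sparse partial least squares steps, whose details the abstract does not give; the data are simulated; the helper names (`screen`, `ols_fit`, `ols_predict`, `refit_variance`, `rcv_variance`) are hypothetical; and the normal-quantile prediction interval ignores coefficient-estimation error. It is not the thesis's implementation.

```python
import numpy as np

def screen(X, y, k):
    """Indices of the k columns with the largest absolute marginal correlation with y.
    ASSUMPTION: a simple stand-in for the Lasso / PLS / sparse PLS selection step."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    score = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) + 1e-12)
    return np.sort(np.argsort(score)[-k:])

def ols_fit(X, y, cols):
    """Least-squares coefficients of y on an intercept plus X[:, cols]."""
    Z = np.column_stack([np.ones(len(y)), X[:, cols]])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return beta

def ols_predict(X, cols, beta):
    """Predictions from coefficients returned by ols_fit."""
    return beta[0] + X[:, cols] @ beta[1:]

def refit_variance(X, y, cols):
    """Residual sum of squares divided by the residual degrees of freedom."""
    resid = y - ols_predict(X, cols, ols_fit(X, y, cols))
    return resid @ resid / (len(y) - len(cols) - 1)

def rcv_variance(X, y, k, rng):
    """Select on one half, refit on the other, swap roles, and average."""
    perm = rng.permutation(len(y))
    a, b = perm[: len(y) // 2], perm[len(y) // 2 :]
    v1 = refit_variance(X[b], y[b], screen(X[a], y[a], k))
    v2 = refit_variance(X[a], y[a], screen(X[b], y[b], k))
    return 0.5 * (v1 + v2)

# Simulated sparse high-dimensional data (illustrative values only).
rng = np.random.default_rng(0)
n, p, k, sigma = 200, 1000, 5, 1.0
beta_true = np.zeros(p)
beta_true[:2] = 3.0
X = rng.standard_normal((n, p))
y = X @ beta_true + sigma * rng.standard_normal(n)

sigma2_hat = rcv_variance(X, y, k, rng)

# Approximate 95% prediction interval: point prediction +/- 1.96 * sigma_hat.
cols = screen(X, y, k)
beta_hat = ols_fit(X, y, cols)
X_new = rng.standard_normal((500, p))
y_new = X_new @ beta_true + sigma * rng.standard_normal(500)
half_width = 1.96 * np.sqrt(sigma2_hat)
coverage = np.mean(np.abs(y_new - ols_predict(X_new, cols, beta_hat)) <= half_width)

print(f"RCV error variance estimate: {sigma2_hat:.3f} (true value {sigma**2:.3f})")
print(f"Empirical coverage of the 95% interval: {coverage:.3f}")
```

Because each half's variance estimate is computed on data that played no part in choosing the variables, spuriously selected variables do not systematically shrink the residuals, which is the property the abstract attributes to RCV.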
Keywords/Search Tags: model uncertainty, RCV, partial least squares, sparse partial least squares, variance estimation, forecast interval