Iterative Divide-and-conquer Method Of Estimating Index Coefficients In Single-index Model Under Massive Data

Posted on: 2022-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Peng
GTID: 2480306779963529
Subject: Investment
Abstract/Summary:
This thesis studies inference for the index coefficients of single-index models on massive datasets. Analyzing massive datasets is challenging because of formidable computational costs and memory requirements. A natural method is the averaging divide-and-conquer approach, which splits the data across several machines, computes an estimator on each machine, and then aggregates the estimators by averaging. However, this approach places a restriction on the number of machines. To overcome this limitation, this thesis proposes a computationally efficient method that requires only an initial estimator computed on one machine and then successively refines that estimator through multiple rounds of aggregation. The proposed estimator achieves the optimal convergence rate without any restriction on the number of machines. Both theoretical analysis and experiments are presented to explore the properties of the proposed method.

The chapters are arranged as follows.

Chapter 1: Briefly describes the concept, development history, and research status of massive data, regression models, and the divide-and-conquer method, and lists some applications by scholars in related fields.

Chapter 2: Reviews the full-data estimation method for the single-index model; derives the formulas needed to implement the iterative divide-and-conquer method and presents its solution procedure as an algorithm; and gives the assumptions, conditions, and theorems underlying the method, together with their derivations and proofs, to verify its effectiveness and accuracy.

Chapter 3: Divided into two parts, simulation comparison and real data analysis. In the simulations, the total sample size and the per-machine sample size are each held fixed in turn to study the effect of the number of machines M; the effect of the total sample size is also considered. In the real data analysis, the performance of the different methods is compared on two datasets: combined cycle power plant data and airline punctuality data.
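
As a rough illustration of the two aggregation schemes contrasted in the abstract, the sketch below uses ordinary least squares as a stand-in estimator. It is not the single-index estimator developed in the thesis, whose details are not given here. The function names (`split_data`, `local_ols`, `averaging_dc`, `iterative_dc`), the Newton-type update built from the first machine's Hessian, and the simulated data are all assumptions made for illustration only.

```python
import numpy as np


def split_data(X, y, n_machines):
    """Split the rows into roughly equal blocks, one per simulated machine."""
    index_blocks = np.array_split(np.arange(X.shape[0]), n_machines)
    return [(X[i], y[i]) for i in index_blocks]


def local_ols(X, y):
    """Least-squares estimate computed from one block of data."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def averaging_dc(blocks):
    """Averaging divide-and-conquer: estimate on each machine, then average."""
    return np.mean([local_ols(Xm, ym) for Xm, ym in blocks], axis=0)


def iterative_dc(blocks, n_rounds=5):
    """Hypothetical iterative scheme: start from one machine's estimate and
    refine it over several rounds, each round aggregating local gradients."""
    X1, y1 = blocks[0]
    beta = local_ols(X1, y1)            # initial estimator from machine 1 only
    H1 = X1.T @ X1 / X1.shape[0]        # Hessian of the squared loss on machine 1
    n_total = sum(Xm.shape[0] for Xm, _ in blocks)
    for _ in range(n_rounds):
        # Each machine contributes only its gradient at the current estimate.
        grad = sum(Xm.T @ (Xm @ beta - ym) for Xm, ym in blocks) / n_total
        beta = beta - np.linalg.solve(H1, grad)   # Newton-type refinement
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta_true = np.array([1.0, -2.0, 0.5])        # assumed true coefficients
    X = rng.normal(size=(20000, 3))
    y = X @ beta_true + 0.5 * rng.normal(size=20000)
    blocks = split_data(X, y, n_machines=20)
    print("averaging:", averaging_dc(blocks))
    print("iterative:", iterative_dc(blocks))
```

In this linear stand-in, each round of the iterative scheme communicates only a gradient vector per machine, and the refined estimate moves toward the full-data least-squares solution as the rounds accumulate, provided the first machine's Hessian is close to the full-data one.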
Keywords/Search Tags: Single-index Model, Massive Data, Divide-and-conquer, Semiparametric Estimation