Stochastic gradient descent (SGD) and its noisy variants are widely used in machine learning. Following the principle of gradient descent, this family of iterative optimization algorithms decreases the loss function along the direction of steepest descent. The stochastic approximation method proposed by Robbins and Monro in 1951 inspired much of the subsequent development of stochastic gradient algorithms. In the modern big-data setting, SGD is a key method for training neural networks, processing large-scale data sets, and solving optimization problems, and it is widely adopted across many fields. The overall performance of an algorithm is central to statistical learning theory. Inspired by Chen et al., the sum of the generalization error and the optimization error can be used to measure this overall performance; in fact, the generalization error and the optimization error measure overfitting and underfitting, respectively. In this paper, we study the stability and convergence of iterative optimization algorithms for deep learning such as SGD. Stability of an algorithm means that its output does not depend too heavily on any single training example, and it is closely related to generalization ability, so quantitative estimates of the generalization error are of both theoretical and practical significance. For stochastic gradient methods, characterizing how SGD affects the generalization error under non-convex learning objectives is of great theoretical interest and is crucial for understanding generalization in deep learning. Unlike existing studies, we focus on the generalization error bounds of the weighted output of the iterative algorithm and on the size of the optimization error. We assume that the loss function is sub-Gaussian. Based on the standard setting, the random-subset setting, and the duplicate-sample setting, we derive generalization bounds for the iterative algorithm using the tools of unconditional mutual information, separated mutual information, and conditional mutual information. Finally, these bounds are applied to concrete algorithms. To verify the results of the analysis, we conduct experiments on the MNIST and CIFAR-10 datasets. Numerical simulation results show that the weighted output consistently performs better on non-convex optimization problems.
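As a concrete illustration of the weighted output discussed above, the following minimal Python sketch runs plain SGD on a toy objective while tracking a uniformly weighted running average of the iterates. It is only a sketch under simplifying assumptions (scalar parameter, uniform weights); the function and variable names are illustrative and not the paper's notation.

```python
import random

def sgd_with_weighted_average(grad, w0, step_size, n_steps, seed=0):
    """Plain SGD that also tracks a uniformly weighted average of the iterates.

    grad(w, rng) should return a stochastic gradient at w.
    Returns (last iterate, weighted average of all iterates).
    """
    rng = random.Random(seed)
    w = w0
    avg = 0.0
    for t in range(1, n_steps + 1):
        g = grad(w, rng)         # stochastic gradient at the current iterate
        w = w - step_size * g    # standard SGD update
        avg += (w - avg) / t     # running average: avg_t = (1/t) * sum_{s<=t} w_s
    return w, avg

# Toy example: minimize f(w) = w^2 / 2 with noisy gradients g = w + noise.
noisy_grad = lambda w, rng: w + rng.gauss(0.0, 1.0)
last, weighted = sgd_with_weighted_average(noisy_grad, w0=5.0,
                                           step_size=0.1, n_steps=2000)
```

With noisy gradients the last iterate keeps fluctuating around the minimizer, while the averaged output smooths out this noise; this averaging effect is one intuition for why the weighted output can enjoy better generalization and optimization behavior.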