Convergence Analysis Of Several Stochastic Gradient Descent Methods With Biased Stochastic Gradients

Posted on: 2022-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Q Luo
GTID: 2480306491459964
Subject: Computational Mathematics
Abstract/Summary:
Stochastic gradient methods are simple and efficient methods for solving large-scale optimization problems and have been widely applied in machine learning and deep learning. However, in contrast to the proliferation of new algorithms, theoretical development of stochastic gradient methods has lagged behind. Most existing convergence analyses of stochastic gradient methods rest on the assumption that the stochastic gradient is an unbiased estimate of the gradient of the objective function, a condition that is often not satisfied in practice. Moreover, the stepsizes chosen in most theoretical analyses are decreasing stepsizes satisfying the condition proposed by Robbins and Monro, which is inconsistent with common practice. This gap between the assumed conditions and the actual situation can make the theoretical results fail as practical guidance. It has been shown that, when solving strongly convex optimization problems, the convergence rate of stochastic gradient descent (SGD) under unbiased gradient estimation can be effectively improved by the α-suffix averaging procedure. However, previous studies have argued that SGD-α, the algorithm obtained by applying this procedure, cannot be computed on the fly. We generalize α-suffix averaging to a more general form, rounding α-suffix averaging, which can be computed on the fly, and obtain the SGD-rα algorithm by applying it to SGD. We then present convergence analyses of SGD, SGD-α, and SGD-rα under the assumption that the gradient estimate is biased, considering several different stepsize schemes. Last but not least, numerical experiments on real-world data verify the effectiveness of the algorithms and the theoretical analysis.
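To make the averaging procedures concrete, below is a minimal Python sketch of SGD with α-suffix averaging alongside an on-the-fly variant. The abstract does not spell out the thesis's definition of rounding α-suffix averaging, so the doubling-based restart scheme sketched here, along with the function names and the 1/t stepsize, is an illustrative assumption rather than the author's exact algorithm.

```python
import numpy as np

def sgd_suffix_avg(grad_fn, x0, T, eta0, alpha=0.5):
    """SGD with alpha-suffix averaging: return the average of the last
    ceil(alpha*T) iterates. The averaging window depends on the total
    iteration count T, so T must be fixed in advance."""
    x = x0.copy()
    suffix_start = T - int(np.ceil(alpha * T))
    avg, count = np.zeros_like(x0), 0
    for t in range(1, T + 1):
        x = x - (eta0 / t) * grad_fn(x)  # decreasing 1/t stepsize
        if t > suffix_start:             # accumulate only the suffix
            avg += x
            count += 1
    return avg / count

def sgd_rounded_suffix_avg(grad_fn, x0, T, eta0):
    """Hypothetical 'rounding' variant (an assumption, not the thesis's
    definition): restart the running average at iterations 1, 2, 4, 8, ...
    so that by the end of each epoch the average covers roughly the most
    recent half of the iterates, using O(1) memory and no advance
    knowledge of T."""
    x = x0.copy()
    avg, count, next_restart = np.zeros_like(x0), 0, 1
    for t in range(1, T + 1):
        x = x - (eta0 / t) * grad_fn(x)
        if t == next_restart:            # "round": drop the old prefix
            avg, count = np.zeros_like(x0), 0
            next_restart *= 2
        avg += x
        count += 1
    return avg / count

# Toy usage: strongly convex f(x) = 0.5*||x||^2, whose true gradient is x,
# observed with noise plus a small constant bias (the biased setting).
rng = np.random.default_rng(0)
grad = lambda x: x + 0.01 + 0.1 * rng.standard_normal(x.shape)
x_hat = sgd_rounded_suffix_avg(grad, np.ones(5), T=10_000, eta0=1.0)
print(np.linalg.norm(x_hat))  # settles near the bias level, not at 0
```

The design contrast this sketch illustrates: the fixed-window average needs either T up front or storage of past iterates, whereas a restart scheme maintains a suffix-like average incrementally, which is what makes an on-the-fly SGD-rα-style computation possible.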
Keywords/Search Tags:Stochastic gradient method, Biased gradient estimation, Averaging procedure, Convergence, Deep learning