| In machine learning, many studies involve solving optimization problems, and stochastic gradient descent (SGD) is one of the most popular optimization algorithms. It still has some shortcomings, such as: the variance of its gradient estimates causes slow convergence; its step-size requirements are relatively strict; and it easily gets stuck at saddle points. Addressing these problems in order to keep improving such algorithms is the general trend. Most convergence analyses of SGD and its variants are carried out under convex or strongly convex assumptions. In fact, optimization problems in deep learning are often nonconvex, so this paper studies nonconvex optimization problems, discussing two common types of optimization problems in deep learning: stochastic optimization problems and finite-sum optimization problems. In recent years, researchers have proposed a series of variance reduction algorithms to improve SGD. They adopt specific forms of stochastic gradient estimation that reduce the variance of the stochastic gradient and accelerate the convergence of the algorithm. As an extension of variance reduction methods, the probabilistic gradient estimator (PAGE) uses a probabilistic switching rule, which avoids the double-loop structure common in variance reduction algorithms. In addition, the well-known momentum technique can improve the performance of algorithms in stochastic optimization. It incorporates past information into the current update through an exponentially decaying weighted moving average, which can also accelerate the convergence of the algorithm. Therefore, this paper combines probabilistic gradient estimation algorithms with momentum at two different positions. Momentum versions of the corresponding probabilistic gradient estimation algorithms are proposed, and their convergence analyses are carried out for the two types of optimization problems. Furthermore, we also discuss the stochastic gradient complexity for the finite-sum optimization problem. It is worth mentioning that the algorithms proposed in this article can be viewed as a framework: when specific parameters are selected within the framework, the algorithms degenerate into existing algorithms, and the convergence results are consistent with existing theory. In other words, those existing algorithms can be seen as special cases of the algorithms in this article. |
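For concreteness, the sketch below illustrates in Python one way a momentum version of a PAGE-style estimator can be organized on a toy finite-sum least-squares problem: with probability p the full gradient is recomputed, otherwise the previous estimate is updated recursively from a single sampled component, and the result is folded into an exponentially decaying weighted moving average before taking the step. The probability p, the momentum weight beta, the step size eta, and the particular position at which momentum is applied are illustrative assumptions, not the paper's exact algorithms.

```python
import numpy as np

# Illustrative sketch only: a PAGE-style probabilistic gradient estimator combined
# with an exponentially decaying momentum average, on a toy finite-sum problem
# f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.  The parameters p, beta, eta and
# the placement of the momentum average are assumptions made for illustration.

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def full_grad(x):
    # gradient of the finite-sum objective averaged over all n components
    return A.T @ (A @ x - b) / n

def sample_grad(x, i):
    # stochastic gradient of a single component i
    return A[i] * (A[i] @ x - b[i])

def page_momentum(eta=0.05, p=0.2, beta=0.9, T=500):
    x = np.zeros(d)
    x_prev = x.copy()
    g = full_grad(x)          # initialize the estimator with a full gradient
    v = g.copy()              # momentum buffer (exponential moving average)
    for _ in range(T):
        if rng.random() < p:
            # with probability p: recompute the full gradient
            g = full_grad(x)
        else:
            # with probability 1 - p: cheap recursive update using one sample
            i = rng.integers(n)
            g = g + sample_grad(x, i) - sample_grad(x_prev, i)
        v = beta * v + (1.0 - beta) * g   # exponentially decaying weighted average
        x_prev = x.copy()
        x = x - eta * v
    return x

x_hat = page_momentum()
print("final objective:", 0.5 * np.mean((A @ x_hat - b) ** 2))
```

In this sketch the probabilistic switch replaces the double-loop (epoch plus inner loop) structure of classical variance reduction methods with a single loop, and setting beta to 0 recovers a plain PAGE-style update, mirroring how the proposed framework degenerates into existing algorithms for particular parameter choices.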