
Research on First-Order Optimization Algorithms in Deep Learning

Posted on: 2022-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Yang
GTID: 2480306338969589
Subject: Mathematics
Abstract/Summary:
As an emerging information technology, deep learning has been widely applied in many fields. In a deep learning model, the values of the parameters at the connections between neurons determine the model's performance, so these parameters must be optimized continuously during training to improve accuracy. The underlying optimization problem is the minimization of an empirical risk function. As the scale of data continues to grow, traditional first-order optimization algorithms can no longer solve the empirical risk minimization problem efficiently. Stochastic algorithms therefore replace the full gradient at each iteration with the gradient of the loss on one randomly selected sample, or on a small subset of samples, in order to reduce the computational cost. At present, stochastic algorithms have become the focus of optimization research in deep learning, and it is necessary to design faster and more efficient optimization algorithms.

First, we introduce the ideas and principles of two classes of first-order stochastic optimization algorithms and discuss the differences and connections between them. The first class is stochastic gradient descent and its improved variants, whose improvement strategies can be roughly divided into three kinds: momentum acceleration, variance reduction, and adaptive learning rates. The first two mainly correct the search direction or the gradient estimate, while the third adapts the step size to the different components of the parameter vector. The second class is the stochastic conjugate gradient algorithm.

Secondly, in order to reduce the cost of gradient computation, we improve an existing stochastic conjugate gradient algorithm and propose a new variance-reduced stochastic conjugate gradient algorithm (SCGA). The effectiveness of SCGA in solving the empirical risk minimization problem is demonstrated by theoretical proof and numerical experiments. Finally, exploiting the advantages of three-term conjugate gradient methods, a new class of stochastic three-term conjugate gradient algorithms is proposed. Numerical results show that the new algorithms converge faster than stochastic gradient algorithms and perform comparably to existing stochastic conjugate gradient algorithms.
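As a concrete illustration of the first class of algorithms described above, the following is a minimal sketch of stochastic gradient descent with momentum acceleration: at each step the full gradient of the empirical risk is replaced by the gradient of the loss on a single sample, and a momentum buffer corrects the search direction. The function `grad_i`, the fixed learning rate, and the momentum coefficient are assumptions made for this sketch; they are not taken from the thesis.

```python
import numpy as np

def sgd_momentum(grad_i, w0, n_samples, lr=0.01, beta=0.9, epochs=10, seed=0):
    """Minimize the empirical risk (1/n) * sum_i f_i(w) with momentum SGD.

    grad_i(w, i) is assumed to return the gradient of the loss on sample i only;
    this single-sample gradient stands in for the full gradient at each step.
    """
    rng = np.random.default_rng(seed)
    w = w0.copy()
    v = np.zeros_like(w)              # momentum buffer
    for _ in range(epochs):
        for i in rng.permutation(n_samples):
            g = grad_i(w, i)          # stochastic gradient (one sample)
            v = beta * v + g          # momentum acceleration: correct the direction
            w = w - lr * v
    return w
```

The same loop structure covers the variance-reduction and adaptive-learning-rate strategies mentioned in the abstract: the former modifies how `g` is estimated, the latter scales `lr` componentwise.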
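To make the second class more concrete, here is a hypothetical sketch that combines an SVRG-style variance-reduced gradient estimate with a conjugate direction update, in the spirit of the variance-reduced stochastic conjugate gradient idea the abstract describes. This is not the thesis's SCGA algorithm; the Fletcher-Reeves coefficient, the fixed step size, and the helper functions `grad_i` and `full_grad` are assumptions for illustration only.

```python
import numpy as np

def vr_scg_sketch(grad_i, full_grad, w0, n_samples, lr=0.05, n_outer=5, m=50, seed=0):
    """Sketch: variance-reduced stochastic conjugate gradient (not the thesis's SCGA)."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(n_outer):
        w_snap = w.copy()
        mu = full_grad(w_snap)                    # full gradient at the snapshot point
        g_prev, d = None, None
        for _ in range(m):
            i = rng.integers(n_samples)
            # SVRG-style variance-reduced gradient estimate
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            if d is None:
                d = -g                            # first direction: steepest descent
            else:
                beta_fr = (g @ g) / (g_prev @ g_prev + 1e-12)  # Fletcher-Reeves (assumed)
                d = -g + beta_fr * d              # conjugate direction update
            w = w + lr * d
            g_prev = g
    return w
```

The periodic snapshot gradient `mu` is what reduces the variance of the stochastic gradient estimate, while the conjugate direction reuses information from the previous step instead of following the raw negative gradient.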
Keywords/Search Tags: deep learning, optimization algorithm, empirical risk minimization, stochastic gradient, stochastic conjugate gradient