
Research on First-Order Optimization Algorithms in Deep Learning

Posted on: 2022-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: H Yang
GTID: 2480306338969589
Subject: Mathematics
Abstract/Summary:
As an emerging information technology, deep learning has been widely applied in many fields. In a deep learning model, the values of the parameters at the connections between neurons determine the model's performance, so these parameters must be optimized continuously during training to improve accuracy. The underlying optimization problem is the minimization of an empirical risk function. As the scale of data continues to grow, traditional first-order optimization algorithms can no longer solve the empirical risk minimization problem efficiently. Stochastic algorithms therefore replace the full gradient at each iteration with the gradient of the loss on one randomly selected sample, or on a small subset of samples, in order to reduce the computational cost. At present, stochastic algorithms have become the focus of optimization research in deep learning, and it is necessary to design faster and more efficient optimization algorithms.

First, we introduce the ideas and principles of two classes of first-order stochastic optimization algorithms and discuss the differences and connections between them. The first class is stochastic gradient descent and its improved variants, whose improvement strategies can be roughly divided into three kinds: momentum acceleration, variance reduction, and adaptive learning rates. The first two mainly correct the search direction or the gradient estimate, while the third adapts the step size to the different components of the parameter vector. The second class is the stochastic conjugate gradient algorithm.

Secondly, in order to reduce the cost of gradient computation, we improve an existing stochastic conjugate gradient algorithm and propose a new variance-reduced stochastic conjugate gradient algorithm (SCGA). The effectiveness of SCGA in solving the empirical risk minimization problem is demonstrated by theoretical proof and numerical experiments. Finally, exploiting the advantages of three-term conjugate gradient methods, a new class of stochastic three-term conjugate gradient algorithms is proposed. Numerical results show that the new algorithms converge faster than stochastic gradient algorithms and perform comparably to existing stochastic conjugate gradient algorithms.
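As a concrete illustration of the first class of algorithms described above, the following is a minimal sketch of stochastic gradient descent with momentum acceleration: at each step the full gradient of the empirical risk is replaced by the gradient of the loss on a single sample, and a momentum buffer corrects the search direction. The function `grad_i`, the fixed learning rate, and the momentum coefficient are assumptions made for this sketch; they are not taken from the thesis.

```python
import numpy as np

def sgd_momentum(grad_i, w0, n_samples, lr=0.01, beta=0.9, epochs=10, seed=0):
    """Minimize the empirical risk (1/n) * sum_i f_i(w) with momentum SGD.

    grad_i(w, i) is assumed to return the gradient of the loss on sample i only;
    this single-sample gradient stands in for the full gradient at each step.
    """
    rng = np.random.default_rng(seed)
    w = w0.copy()
    v = np.zeros_like(w)              # momentum buffer
    for _ in range(epochs):
        for i in rng.permutation(n_samples):
            g = grad_i(w, i)          # stochastic gradient (one sample)
            v = beta * v + g          # momentum acceleration: correct the direction
            w = w - lr * v
    return w
```

The same loop structure covers the variance-reduction and adaptive-learning-rate strategies mentioned in the abstract: the former modifies how `g` is estimated, the latter scales `lr` componentwise.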
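To make the second class more concrete, here is a hypothetical sketch that combines an SVRG-style variance-reduced gradient estimate with a conjugate direction update, in the spirit of the variance-reduced stochastic conjugate gradient idea the abstract describes. This is not the thesis's SCGA algorithm; the Fletcher-Reeves coefficient, the fixed step size, and the helper functions `grad_i` and `full_grad` are assumptions for illustration only.

```python
import numpy as np

def vr_scg_sketch(grad_i, full_grad, w0, n_samples, lr=0.05, n_outer=5, m=50, seed=0):
    """Sketch: variance-reduced stochastic conjugate gradient (not the thesis's SCGA)."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(n_outer):
        w_snap = w.copy()
        mu = full_grad(w_snap)                    # full gradient at the snapshot point
        g_prev, d = None, None
        for _ in range(m):
            i = rng.integers(n_samples)
            # SVRG-style variance-reduced gradient estimate
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            if d is None:
                d = -g                            # first direction: steepest descent
            else:
                beta_fr = (g @ g) / (g_prev @ g_prev + 1e-12)  # Fletcher-Reeves (assumed)
                d = -g + beta_fr * d              # conjugate direction update
            w = w + lr * d
            g_prev = g
    return w
```

The periodic snapshot gradient `mu` is what reduces the variance of the stochastic gradient estimate, while the conjugate direction reuses information from the previous step instead of following the raw negative gradient.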
Keywords/Search Tags: deep learning, optimization algorithm, empirical risk minimization, stochastic gradient, stochastic conjugate gradient