
Research On Implicit Stochastic Gradient Descent Method

Posted on: 2020-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: X R Zhang
Full Text: PDF
GTID: 2370330575480392
Subject: Probability theory and mathematical statistics
Abstract/Summary:
In statistics and machine learning, parameter estimation problems are often solved by optimizing an objective function. With the development of computers and information technology, however, optimization problems can involve millions or even billions of training samples, so the ability of machine learning is limited by computation time rather than by sample size. Classical optimization-based estimation methods, although widely used, cannot be applied to such large-scale modern data sets. In this situation the stochastic gradient descent method has become very popular. It is a recursive estimation method that updates the model parameters with a small amount of data at each step instead of traversing the whole data set, which makes it convenient for estimation on very large data sets.

However, the traditional stochastic gradient descent method is numerically unstable. If the learning-rate parameter is chosen too small, convergence becomes slow; if it is chosen too large, the asymptotic variance becomes large and the iteration may even diverge numerically. The learning-rate parameter therefore has to be chosen carefully, and methods for selecting it continue to be proposed and improved.

This paper explores a new method built on the standard stochastic gradient descent method that adopts the idea of implicit updates to improve the traditional procedure. We call it the implicit stochastic gradient descent method; to make the distinction clear, the traditional method is called the explicit stochastic gradient descent method.

In this paper, the explicit and implicit stochastic gradient descent methods are used to estimate the parameters of two commonly used statistical models. To make the results comprehensive and reliable, a linear regression model (a regression problem) and a logistic regression model (a classification problem) are chosen, with the classic lm() and glm() functions of the R statistical software as benchmarks. The results show that both stochastic gradient descent methods greatly reduce execution time compared with the conventional methods and are therefore better suited to large-scale data sets. Specifically, under the three learning-rate selection schemes considered, the explicit stochastic gradient descent method is unstable, so its parameters must be chosen carefully in practice; in contrast, the implicit stochastic gradient descent method is stable under all three schemes. This paper therefore suggests that the implicit stochastic gradient descent method is the better choice and deserves further research and attention.
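To make the contrast concrete, the following sketch in R (not the thesis author's code; the function name sgd_linear, the learning-rate schedule gamma_i = gamma0/i, and the simulated data are illustrative assumptions) implements both updates for linear regression. The explicit update is theta_i = theta_{i-1} + gamma_i * (y_i - x_i' theta_{i-1}) * x_i; the implicit update evaluates the gradient at the new iterate, theta_i = theta_{i-1} + gamma_i * (y_i - x_i' theta_i) * x_i, which for squared-error loss can be solved in closed form.

# Sketch: explicit vs. implicit SGD for linear regression with squared-error
# loss, using the decaying learning rate gamma_i = gamma0 / i (illustrative).
sgd_linear <- function(X, y, gamma0 = 1, implicit = FALSE) {
  theta <- rep(0, ncol(X))                 # start from the zero vector
  for (i in seq_len(nrow(X))) {
    gi <- gamma0 / i                       # learning rate at step i
    xi <- X[i, ]
    ri <- y[i] - sum(xi * theta)           # residual at the current iterate
    if (implicit) {
      # Implicit update: solving theta_i = theta_{i-1} + gi*(y_i - x_i' theta_i)*x_i
      # for theta_i gives the closed form below, which shrinks the step by the
      # factor 1/(1 + gi*||x_i||^2).
      theta <- theta + (gi / (1 + gi * sum(xi^2))) * ri * xi
    } else {
      # Explicit (standard) update: gradient evaluated at the old iterate.
      theta <- theta + gi * ri * xi
    }
  }
  theta
}

# Toy comparison against the lm() benchmark on simulated data.
set.seed(1)
X <- matrix(rnorm(5000 * 3), ncol = 3)
y <- as.vector(X %*% c(1, -2, 0.5) + rnorm(5000))
round(rbind(lm       = coef(lm(y ~ X - 1)),
            explicit = sgd_linear(X, y),
            implicit = sgd_linear(X, y, implicit = TRUE)), 3)

Because the implicit step is damped by 1/(1 + gi * ||x_i||^2), an overly large gamma0 merely slows the implicit iteration rather than making it diverge, which is one way to see the stability contrast described above.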
Keywords/Search Tags:Stochastic approximation, Implicit updates, Stochastic gradient descent, Large data sets