
Research On Implicit Stochastic Gradient Descent Method

Posted on: 2020-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: X R Zhang
Full Text: PDF
GTID: 2370330575480392
Subject: Probability theory and mathematical statistics
Abstract/Summary:
In statistics and machine learning, parameter estimation problems are often solved by optimizing an objective function. With the development of computers and information technology, however, optimization problems can involve millions or even billions of training samples, so the ability of machine learning is limited by computation time rather than by sample size. Classical optimization-based estimation methods, although widely used, cannot be applied to such large-scale modern data sets. In this situation the stochastic gradient descent method has become very popular. It is a recursive estimation method that updates the model parameters with a small amount of data at each step instead of traversing the whole data set, which makes it convenient for estimation on very large data sets.

However, the traditional stochastic gradient descent method is numerically unstable. If the learning-rate parameter is chosen too small, convergence becomes slow; if it is chosen too large, the asymptotic variance becomes large and the iteration may even diverge numerically. The learning-rate parameter therefore has to be chosen carefully, and methods for selecting it continue to be proposed and improved.

This paper explores a new method built on the standard stochastic gradient descent method that adopts the idea of implicit updates to improve the traditional procedure. We call it the implicit stochastic gradient descent method; to make the distinction clear, the traditional method is called the explicit stochastic gradient descent method.

In this paper, the explicit and implicit stochastic gradient descent methods are used to estimate the parameters of two commonly used statistical models. To make the results comprehensive and reliable, a linear regression model (a regression problem) and a logistic regression model (a classification problem) are chosen, with the classic lm() and glm() functions of the R statistical software as benchmarks. The results show that both stochastic gradient descent methods greatly reduce execution time compared with the conventional methods and are therefore better suited to large-scale data sets. Specifically, under the three learning-rate selection schemes considered, the explicit stochastic gradient descent method is unstable, so its parameters must be chosen carefully in practice; in contrast, the implicit stochastic gradient descent method is stable under all three schemes. This paper therefore suggests that the implicit stochastic gradient descent method is the better choice and deserves further research and attention.
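To make the contrast concrete, the following sketch in R (not the thesis author's code; the function name sgd_linear, the learning-rate schedule gamma_i = gamma0/i, and the simulated data are illustrative assumptions) implements both updates for linear regression. The explicit update is theta_i = theta_{i-1} + gamma_i * (y_i - x_i' theta_{i-1}) * x_i; the implicit update evaluates the gradient at the new iterate, theta_i = theta_{i-1} + gamma_i * (y_i - x_i' theta_i) * x_i, which for squared-error loss can be solved in closed form.

# Sketch: explicit vs. implicit SGD for linear regression with squared-error
# loss, using the decaying learning rate gamma_i = gamma0 / i (illustrative).
sgd_linear <- function(X, y, gamma0 = 1, implicit = FALSE) {
  theta <- rep(0, ncol(X))                 # start from the zero vector
  for (i in seq_len(nrow(X))) {
    gi <- gamma0 / i                       # learning rate at step i
    xi <- X[i, ]
    ri <- y[i] - sum(xi * theta)           # residual at the current iterate
    if (implicit) {
      # Implicit update: solving theta_i = theta_{i-1} + gi*(y_i - x_i' theta_i)*x_i
      # for theta_i gives the closed form below, which shrinks the step by the
      # factor 1/(1 + gi*||x_i||^2).
      theta <- theta + (gi / (1 + gi * sum(xi^2))) * ri * xi
    } else {
      # Explicit (standard) update: gradient evaluated at the old iterate.
      theta <- theta + gi * ri * xi
    }
  }
  theta
}

# Toy comparison against the lm() benchmark on simulated data.
set.seed(1)
X <- matrix(rnorm(5000 * 3), ncol = 3)
y <- as.vector(X %*% c(1, -2, 0.5) + rnorm(5000))
round(rbind(lm       = coef(lm(y ~ X - 1)),
            explicit = sgd_linear(X, y),
            implicit = sgd_linear(X, y, implicit = TRUE)), 3)

Because the implicit step is damped by 1/(1 + gi * ||x_i||^2), an overly large gamma0 merely slows the implicit iteration rather than making it diverge, which is one way to see the stability contrast described above.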
Keywords/Search Tags:Stochastic approximation, Implicit updates, Stochastic gradient descent, Large data sets