
Convergence Analysis of Momentum-Driven BP Neural Networks

Posted on: 2006-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: G F Zheng
Full Text: PDF
GTID: 2190360152985572
Subject: Computational Mathematics

Abstract/Summary:
Multilayer feedforward neural networks have been widely used in many applications, and back-propagation (BP) is the most popular algorithm for training them. However, BP converges slowly and easily gets stuck at local minima, rendering it unsuitable for many online applications. The momentum method is a standard technique for improving its convergence performance.

In the conventional back-propagation algorithm with momentum (BPM), the momentum coefficient is typically chosen as a constant in the interval (0,1) [1], and much work has been done on the analysis of the algorithm [14-17]. By drawing an analogy between the movement of Newtonian particles in a viscous medium and the convergence of the BPM algorithm, Qian provided conditions for the stability of the algorithm. In addition, Hagiwara and Sato [15, 17] have shown that the momentum mechanism can be derived from a modified cost function in which the squared errors are exponentially weighted in time. However, with a fixed momentum coefficient, the momentum term has an accelerating effect only if the current downhill gradient of the error function, -E_ω(k), and the last weight change, Δω(k-1), point in similar directions. When the current -E_ω(k) opposes the previous update Δω(k-1), the momentum may push the weights up the slope of the error surface instead of down it, as desired. Since in most cases the error surface is multidimensional and ever-changing, the momentum coefficient should, to be more effective, be varied adaptively rather than kept fixed throughout the entire training procedure [13].

The first contribution of this thesis is an efficient BP-momentum algorithm in which the momentum coefficient is adjusted dynamically at each training cycle, based on the current gradient and the weight change of the previous step. When the angle between -E_ω(k) and Δω(k-1) is less than 90°, the direction of error minimization changes little and the weights are likely moving across a plateau, so we choose a positive momentum coefficient to accelerate convergence. When the angle between them exceeds 90°, the direction of error minimization changes abruptly, which suggests the weights are moving along the wall of a ravine; in this case, to guarantee descent of the error function, we set the momentum coefficient to zero so that the weight update follows the direction of the current negative gradient. The second contribution is a proof of an important weak convergence result for this new adaptive BPM algorithm; the monotonicity of the error function over the training iterations is also established. To test the performance of the new algorithm, we apply it to the XOR and parity problems. Simulation results show that the new algorithm outperforms other BP algorithms in reducing training time while maintaining a high success rate.
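To make the switching rule concrete, the following is a minimal Python sketch of one weight update under the angle-based scheme described above. It is an illustration reconstructed from the abstract, not the thesis's implementation; the function name adaptive_bpm_step and the constants eta and mu are assumptions introduced here.

import numpy as np

def adaptive_bpm_step(w, grad_E, prev_dw, eta=0.1, mu=0.9):
    # One weight update of BP with an adaptively switched momentum term
    # (a sketch of the rule described in the abstract).
    #
    # The angle test between -grad_E(k) and dw(k-1) reduces to the sign
    # of their inner product: a positive value (angle < 90 degrees) keeps
    # the momentum on to accelerate movement across a plateau; otherwise
    # (angle >= 90 degrees) the momentum coefficient is set to zero so
    # the step follows the plain negative gradient and the error descends.
    if np.dot(-grad_E, prev_dw) > 0.0:
        momentum = mu        # directions agree: accelerate
    else:
        momentum = 0.0       # abrupt direction change: pure gradient step
    dw = -eta * grad_E + momentum * prev_dw
    return w + dw, dw

# Usage on a toy quadratic error E(w) = 0.5 * ||w||^2, whose gradient is w:
w = np.array([2.0, -3.0])
dw = np.zeros_like(w)
for k in range(50):
    w, dw = adaptive_bpm_step(w, grad_E=w, prev_dw=dw)
print(w)  # approaches the minimum at the origin

On the first iteration prev_dw is zero, so the inner product vanishes and the update reduces to a plain gradient step, matching the descent guarantee claimed for the zero-momentum branch.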
Keywords/Search Tags: Gradient method, BP algorithm, momentum term, convergence