
Convergence Analysis of Momentum-Driven BP Neural Networks

Posted on: 2006-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: G F Zheng
Full Text: PDF
GTID: 2190360152985572
Subject: Computational Mathematics

Abstract/Summary:
Multilayer feedforward neural networks have been widely used in many applications, and back-propagation (BP) is the most popular algorithm for training them. However, BP converges slowly and easily gets stuck at local minima, rendering it unsuitable for many online applications. The momentum method is a standard technique for improving its convergence performance.

In the conventional back-propagation algorithm with momentum (BPM), the momentum coefficient is typically chosen as a constant in the interval (0,1) [1], and much work has been done on the analysis of the algorithm [14-17]. By drawing an analogy between the movement of Newtonian particles in a viscous medium and the convergence of the BPM algorithm, Qian provided conditions for the stability of the algorithm. In addition, Hagiwara and Sato [15, 17] have shown that the momentum mechanism can be derived from a modified cost function in which the squared errors are exponentially weighted in time. However, with a fixed momentum coefficient, the momentum term has an accelerating effect only if the current downhill gradient of the error function, -E_ω(k), and the last weight change, Δω(k-1), point in similar directions. When the current -E_ω(k) opposes the previous update Δω(k-1), the momentum may push the weights up the slope of the error surface instead of down it, as desired. Since in most cases the error surface is multidimensional and ever-changing, the momentum coefficient should, to be more effective, be varied adaptively rather than kept fixed throughout the entire training procedure [13].

The first contribution of this thesis is an efficient BP-momentum algorithm in which the momentum coefficient is adjusted dynamically at each training cycle, based on the current gradient and the weight change of the previous step. When the angle between -E_ω(k) and Δω(k-1) is less than 90°, the direction of error minimization changes little and the weights are likely moving across a plateau, so we choose a positive momentum coefficient to accelerate convergence. When the angle between them exceeds 90°, the direction of error minimization changes abruptly, which suggests the weights are moving along the wall of a ravine; in this case, to guarantee descent of the error function, we set the momentum coefficient to zero so that the weight update follows the direction of the current negative gradient. The second contribution is a proof of an important weak convergence result for this new adaptive BPM algorithm; the monotonicity of the error function over the training iterations is also established. To test the performance of the new algorithm, we apply it to the XOR and parity problems. Simulation results show that the new algorithm outperforms other BP algorithms in reducing training time while maintaining a high success rate.
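To make the switching rule concrete, the following is a minimal Python sketch of one weight update under the angle-based scheme described above. It is an illustration reconstructed from the abstract, not the thesis's implementation; the function name adaptive_bpm_step and the constants eta and mu are assumptions introduced here.

import numpy as np

def adaptive_bpm_step(w, grad_E, prev_dw, eta=0.1, mu=0.9):
    # One weight update of BP with an adaptively switched momentum term
    # (a sketch of the rule described in the abstract).
    #
    # The angle test between -grad_E(k) and dw(k-1) reduces to the sign
    # of their inner product: a positive value (angle < 90 degrees) keeps
    # the momentum on to accelerate movement across a plateau; otherwise
    # (angle >= 90 degrees) the momentum coefficient is set to zero so
    # the step follows the plain negative gradient and the error descends.
    if np.dot(-grad_E, prev_dw) > 0.0:
        momentum = mu        # directions agree: accelerate
    else:
        momentum = 0.0       # abrupt direction change: pure gradient step
    dw = -eta * grad_E + momentum * prev_dw
    return w + dw, dw

# Usage on a toy quadratic error E(w) = 0.5 * ||w||^2, whose gradient is w:
w = np.array([2.0, -3.0])
dw = np.zeros_like(w)
for k in range(50):
    w, dw = adaptive_bpm_step(w, grad_E=w, prev_dw=dw)
print(w)  # approaches the minimum at the origin

On the first iteration prev_dw is zero, so the inner product vanishes and the update reduces to a plain gradient step, matching the descent guarantee claimed for the zero-momentum branch.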
Keywords/Search Tags: Gradient method, BP algorithm, momentum term, convergence