New Levenberg-Marquardt Training Algorithms for RNN
In the gradient descent algorithm, the error function to be minimized must be continuous and differentiable with respect to its parameters, so that the gradient of the error with respect to any parameter can be computed analytically. Once the gradients are available, the most straightforward approach to error minimization is gradient descent, in which, at each step, the parameter vector is updated in the direction opposite to the gradient at the current point (the update rule is sketched after the list below). Despite its simplicity, this approach has several drawbacks:
- the zig-zag behavior of the search trajectory,
- the difficulty of choosing the value of the learning-rate parameter,
- its slow convergence, which usually requires a large number of training steps.
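For concreteness, the plain gradient descent update discussed above can be written as follows; this is a generic sketch, in which $w$ denotes the vector of RNN weights, $E(w)$ the error function and $\eta$ the learning rate (the symbols are used here only for illustration):
\[
  w^{(k+1)} = w^{(k)} - \eta \, \nabla_{w} E\big(w^{(k)}\big).
\]
The step direction is always the negative gradient and the step length is governed entirely by the fixed learning rate $\eta$, which is what produces the zig-zag trajectories and the slow convergence listed above.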
To alleviate these problems, we apply the Levenberg-Marquardt (LM) method, an approximation to Newton's method, to the RNN. The LM technique is widely regarded as one of the most efficient optimization algorithms for training ANN [55]. In this Section we introduce the LM method together with some of its variants. In particular, a recent improvement of the traditional LM method, referred to as LM with adaptive momentum, has been proposed for ANN; we also present this algorithm adapted to the RNN.
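As a point of reference before the detailed formulation in the following subsections, the generic LM step for a sum-of-squares error $E(w)=\tfrac{1}{2}\sum_i e_i(w)^2$ can be sketched as follows; the notation ($J$ for the Jacobian of the residuals $e_i$ with respect to the weights, $\mu$ for the damping parameter, $I$ for the identity matrix) is introduced here for illustration only:
\[
  \Delta w = -\left(J^{T} J + \mu I\right)^{-1} J^{T} e ,
  \qquad J_{ij} = \frac{\partial e_i}{\partial w_j}.
\]
Roughly speaking, as $\mu \to 0$ the step approaches the Gauss-Newton step, while for large $\mu$ it reduces to a small gradient descent step; adjusting $\mu$ from iteration to iteration is what gives LM its robustness, and the adaptive momentum variant mentioned above modifies this basic step with an additional momentum-like term.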