Gradient Descent Training Algorithm for RNN
Gelenbe's RNN training algorithm [47] proceeds as follows. Recall that $w^+(i,j)$ and $w^-(i,j)$ represent the rates at which neuron $i$ sends respectively excitation or inhibition spikes to neuron $j$. Let $n$ denote the total number of neurons in the network. The adjustable network parameters are the two matrices $\mathbf{W}^+ = \{w^+(i,j)\}$ and $\mathbf{W}^- = \{w^-(i,j)\}$, both of which have $n^2$ elements. These parameters should be determined by the training algorithm from the training examples (a set of input-output pairs). The set of inputs is denoted by $X = \{x_1, \ldots, x_K\}$. Each of the inputs $x_k$, $k = 1, \ldots, K$, consists of a set of excitation-inhibition pairs $x_k = (\Lambda_k, \lambda_k)$ representing the signal flow entering each neuron from outside the network. Thus, for each set of inputs, we have $\Lambda_k = (\Lambda_k(1), \ldots, \Lambda_k(n))$ and $\lambda_k = (\lambda_k(1), \ldots, \lambda_k(n))$. The set of outputs is denoted by $Y = \{y_1, \ldots, y_K\}$, where $y_k = (y_{1k}, \ldots, y_{nk})$; the elements $y_{ik}$ represent the desired output of each neuron $i$ for the training pair $k$.
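To make these quantities concrete, the following is a minimal setup sketch in Python/NumPy; the names `W_plus`, `W_minus`, `Lambda`, `lam` and `y`, as well as the sizes and random initialization, are illustrative assumptions rather than part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                 # number of neurons
K = 3                 # number of training pairs

# Adjustable parameters: W_plus[i, j] = w+(i, j), W_minus[i, j] = w-(i, j);
# both are n x n matrices of non-negative transition rates.
W_plus  = rng.uniform(0.0, 1.0, size=(n, n))
W_minus = rng.uniform(0.0, 1.0, size=(n, n))
np.fill_diagonal(W_plus, 0.0)   # no self-loops
np.fill_diagonal(W_minus, 0.0)

# Inputs x_k = (Lambda_k, lambda_k): external excitation and inhibition
# rates entering each neuron, one row per training pair.
Lambda = rng.uniform(0.0, 1.0, size=(K, n))   # Lambda[k, i]
lam    = rng.uniform(0.0, 0.5, size=(K, n))   # lambda[k, i]

# Desired outputs y[k, i], one per neuron and training pair.
y = rng.uniform(0.0, 1.0, size=(K, n))
```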
The training technique must adjust the parameters in order to minimize a cost function $E_k$, given by

$$E_k = \frac{1}{2} \sum_{i=1}^{n} a_i \left( q_{ik} - y_{ik} \right)^2, \qquad a_i \geq 0, \tag{10.1}$$

where $q_{ik}$ and $y_{ik}$ are the neuron's actual and desired outputs for the input-output pair $k$, respectively. The constant $a_i$ is set to zero for the neurons that are not connected to the outputs of the network. At each successive input-output pair $(x_k, y_k)$, the adjustable network parameters $w^+(u,v)$ and $w^-(u,v)$, where $u, v = 1, \ldots, n$, need to be updated. The algorithm used by Gelenbe is gradient descent. Observe that the values in the matrices $\mathbf{W}^+$ and $\mathbf{W}^-$ must not be negative (since their elements are transition rates).
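In the NumPy sketch above, the cost 10.1 is a one-liner; here `a` is assumed to encode the constants $a_i$ as a vector with entry 1 for output neurons and 0 elsewhere.

```python
def cost(q_k, y_k, a):
    """Cost E_k of Eqn. 10.1: (1/2) * sum_i a_i * (q_ik - y_ik)^2."""
    return 0.5 * np.sum(a * (q_k - y_k) ** 2)
```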
The rule for the weight update is as follows:

$$w_k(u,v) = w_{k-1}(u,v) - \eta \sum_{i=1}^{n} a_i \left( q_{ik} - y_{ik} \right) \left[ \frac{\partial q_i}{\partial w(u,v)} \right]_k. \tag{10.2}$$

Here, $\eta > 0$ (it is called the learning rate), and $q_{ik}$ is the output of neuron $i$ calculated from the input $x_k$ and from the equations 10.3-10.5 by setting $w(u,v) = w_{k-1}(u,v)$, where $w(u,v)$ can be either $w^+(u,v)$ or $w^-(u,v)$.
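A direct transcription of the update 10.2 might look as follows. It assumes derivative arrays `dq_dWp[u, v, i]` $= \partial q_i / \partial w^+(u,v)$ and likewise `dq_dWm` for $w^-$, computed as shown further below from Eqn. 10.9; these array names are illustrative.

```python
def update_weights(W_plus, W_minus, q_k, y_k, a, dq_dWp, dq_dWm, eta):
    """One gradient-descent step of Eqn. 10.2 for every pair (u, v)."""
    err = a * (q_k - y_k)                    # a_i * (q_ik - y_ik), shape (n,)
    # For each (u, v), sum over i of err_i * dq_i/dw(u, v).
    grad_p = np.tensordot(dq_dWp, err, axes=([2], [0]))   # shape (n, n)
    grad_m = np.tensordot(dq_dWm, err, axes=([2], [0]))
    return W_plus - eta * grad_p, W_minus - eta * grad_m
```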
Recall that

$$q_i = \frac{\lambda^+(i)}{r(i) + \lambda^-(i)}, \tag{10.3}$$

where

$$\lambda^+(i) = \sum_{j=1}^{n} q_j w^+(j,i) + \Lambda(i) \tag{10.4}$$

and

$$\lambda^-(i) = \sum_{j=1}^{n} q_j w^-(j,i) + \lambda(i), \tag{10.5}$$

with $r(i) = \sum_{j=1}^{n} \left[ w^+(i,j) + w^-(i,j) \right]$ the firing rate of neuron $i$.
Using Eqn. 10.3 to compute $\partial q_i / \partial w(u,v)$, we have:

$$\frac{\partial q_i}{\partial w^+(u,v)} = \sum_{j=1}^{n} \frac{\partial q_j}{\partial w^+(u,v)} \, \frac{w^+(j,i) - w^-(j,i)\, q_i}{D(i)} - \mathbf{1}[u=i]\, \frac{q_i}{D(i)} + \mathbf{1}[v=i]\, \frac{q_u}{D(i)}, \tag{10.6}$$

$$\frac{\partial q_i}{\partial w^-(u,v)} = \sum_{j=1}^{n} \frac{\partial q_j}{\partial w^-(u,v)} \, \frac{w^+(j,i) - w^-(j,i)\, q_i}{D(i)} - \mathbf{1}[u=i]\, \frac{q_i}{D(i)} - \mathbf{1}[v=i]\, \frac{q_u q_i}{D(i)}, \tag{10.7}$$

where $D(i) = r(i) + \lambda^-(i)$ and $\mathbf{1}[X]$ equals 1 if the condition $X$ holds and 0 otherwise.
Let $\mathbf{q} = (q_1, \ldots, q_n)$, and define the $n \times n$ matrix:

$$\mathbf{W} = \left\{ W(i,j) = \frac{w^+(i,j) - w^-(i,j)\, q_j}{D(j)} \right\}, \qquad i, j = 1, \ldots, n.$$

We can now write

$$\frac{\partial \mathbf{q}}{\partial w^+(u,v)} = \frac{\partial \mathbf{q}}{\partial w^+(u,v)}\, \mathbf{W} + \gamma^+(u,v)\, q_u, \qquad \frac{\partial \mathbf{q}}{\partial w^-(u,v)} = \frac{\partial \mathbf{q}}{\partial w^-(u,v)}\, \mathbf{W} + \gamma^-(u,v)\, q_u, \tag{10.8}$$

where the elements of the $n$-vectors $\gamma^+(u,v)$, $\gamma^-(u,v)$ are given by:

$$\gamma_i^+(u,v) = \begin{cases} -1/D(i) & \text{if } u = i,\ v \neq i, \\ +1/D(i) & \text{if } u \neq i,\ v = i, \\ 0 & \text{otherwise,} \end{cases} \qquad \gamma_i^-(u,v) = \begin{cases} -(1+q_i)/D(i) & \text{if } u = i,\ v = i, \\ -1/D(i) & \text{if } u = i,\ v \neq i, \\ -q_i/D(i) & \text{if } u \neq i,\ v = i, \\ 0 & \text{otherwise.} \end{cases}$$
It can be observed that Eqn. 10.8 can be rewritten as follows:

$$\frac{\partial \mathbf{q}}{\partial w^+(u,v)} = \gamma^+(u,v)\, q_u \left[ \mathbf{I} - \mathbf{W} \right]^{-1}, \qquad \frac{\partial \mathbf{q}}{\partial w^-(u,v)} = \gamma^-(u,v)\, q_u \left[ \mathbf{I} - \mathbf{W} \right]^{-1}, \tag{10.9}$$

where $\mathbf{I}$ is the $n \times n$ identity matrix.
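In code, Eqn. 10.9 amounts to one matrix inversion shared by all pairs $(u,v)$, followed by $2n^2$ row-vector products. A sketch, under the same illustrative naming as before:

```python
def solve_dq(W_plus, W_minus, q, lam_k):
    """Solve Eqn. 10.9 for the derivatives dq/dw+(u,v) and dq/dw-(u,v).

    Returns arrays dq_dWp, dq_dWm of shape (n, n, n), indexed so that
    dq_dWp[u, v, i] = dq_i / dw+(u, v).
    """
    n = len(q)
    r = W_plus.sum(axis=1) + W_minus.sum(axis=1)
    D = r + q @ W_minus + lam_k                   # D(i) = r(i) + lambda-(i)

    # Matrix W of Eqn. 10.8: W[i, j] = [w+(i,j) - w-(i,j) q_j] / D(j).
    W = (W_plus - W_minus * q[None, :]) / D[None, :]
    inv = np.linalg.inv(np.eye(n) - W)            # [I - W]^{-1}

    dq_dWp = np.zeros((n, n, n))
    dq_dWm = np.zeros((n, n, n))
    for u in range(n):
        for v in range(n):
            gp = np.zeros(n)                      # gamma+(u, v)
            gm = np.zeros(n)                      # gamma-(u, v)
            gp[u] -= 1.0 / D[u]
            gp[v] += 1.0 / D[v]                   # the two terms cancel if u == v
            gm[u] -= 1.0 / D[u]
            gm[v] -= q[v] / D[v]
            # Row-vector form of Eqn. 10.9: dq = gamma * q_u * [I - W]^{-1}.
            dq_dWp[u, v] = q[u] * (gp @ inv)
            dq_dWm[u, v] = q[u] * (gm @ inv)
    return dq_dWp, dq_dWm
```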
The training algorithm can be started by first initializing the two matrices $\mathbf{W}_0^+$ and $\mathbf{W}_0^-$ randomly. Then, we have to choose the value of $\eta$. For each successive value of $k$, starting from $k = 1$, one must proceed as follows:

1. Set the input values to $x_k = (\Lambda_k, \lambda_k)$.
2. Solve the system of nonlinear equations 10.3-10.5.
3. Solve the system of linear equations 10.9 with the results of step 2.
4. Using Eqn. 10.2 and the results of steps 2 and 3, update the matrices $\mathbf{W}_k^+$ and $\mathbf{W}_k^-$. It should be noted that the values in the two matrices $\mathbf{W}^+$ and $\mathbf{W}^-$ should not be negative, as stated before. Thus, at any step of the algorithm, if the iteration yields a negative value for a term, we consider two possibilities:

   (a) Set the negative value to zero and stop the iteration for this term at this step $k$. In the next step $k+1$, iterate on this term with the same rule, starting from its current zero value.

   (b) Go back to the previous value of the term and iterate with a smaller value of $\eta$.

After the training phase, the network can be used in normal operation; the only computations needed to obtain the outputs are those given by equations 10.3-10.5.
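Putting the illustrative pieces above together, one possible rendering of the whole training loop is the following sketch; it handles the non-negativity constraint with option (a), i.e. negative entries are clipped to zero after each update.

```python
def train(W_plus, W_minus, Lambda, lam, y, a, eta=0.1, epochs=50):
    """Gradient-descent training over the K input-output pairs."""
    for _ in range(epochs):
        for k in range(len(y)):
            # Steps 1-2: set the inputs, solve the nonlinear system 10.3-10.5.
            q = solve_q(W_plus, W_minus, Lambda[k], lam[k])
            # Step 3: solve the linear systems 10.9.
            dq_dWp, dq_dWm = solve_dq(W_plus, W_minus, q, lam[k])
            # Step 4: update the matrices via Eqn. 10.2 ...
            W_plus, W_minus = update_weights(W_plus, W_minus, q, y[k], a,
                                             dq_dWp, dq_dWm, eta)
            # ... keeping the rates non-negative (option (a): clip to zero).
            np.maximum(W_plus, 0.0, out=W_plus)
            np.maximum(W_minus, 0.0, out=W_minus)
    return W_plus, W_minus
```

In normal operation, only `solve_q` is needed: the trained matrices together with a new input $(\Lambda, \lambda)$ yield the network outputs $q$.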