Gradient Descent Training Algorithm for RNN
Gelenbe's RNN training algorithm [47] proceeds as follows. Recall that $w^+_{i,j}$ and $w^-_{i,j}$ represent the rates at which neuron $i$ sends respectively excitation or inhibition spikes to neuron $j$. Let $n$ denote the total number of neurons in the network. The adjustable network parameters are the two matrices $\mathbf{W^+}=\{w^+_{i,j}\}$ and $\mathbf{W^-}=\{w^-_{i,j}\}$, both of which have $n^2$ elements. These parameters should be determined by the training algorithm from the training examples (a set of input-output pairs). The set of inputs is denoted by $X=\{X_1,\ldots,X_K\}$. Each of the inputs $X_k$, $k=1,\ldots,K$, consists of a set of excitation-inhibition pairs $X_k=(\Lambda_k,\lambda_k)$ representing the signal flow entering each neuron from outside the network. Thus, for each set of inputs, we have $\Lambda_k=(\Lambda_{1k},\ldots,\Lambda_{nk})$ and $\lambda_k=(\lambda_{1k},\ldots,\lambda_{nk})$. The set of outputs is denoted by $Y=\{y_1,\ldots,y_K\}$, where $y_k=(y_{1k},\ldots,y_{nk})$; the elements $y_{ik}\in[0,1]$ represent the desired output of each neuron.
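To fix this notation in code, the following is a minimal sketch of how such a training set and the parameter matrices could be laid out as NumPy arrays; the array names, shapes, and random initial values are illustrative assumptions, not part of Gelenbe's formulation.

\begin{verbatim}
import numpy as np

n, K = 4, 10  # n neurons, K training pairs (illustrative sizes)

# Lambda[k, i]: rate of excitation spikes entering neuron i from outside
# lam[k, i]:    rate of inhibition spikes entering neuron i from outside
Lambda = np.random.rand(K, n)
lam = np.random.rand(K, n)

# y[k, i]: desired output of neuron i for training pair k, in [0, 1]
y = np.random.rand(K, n)

# Adjustable parameters: non-negative transition-rate matrices,
# W_plus[i, j] = w+_{i,j}, W_minus[i, j] = w-_{i,j}
W_plus = np.random.rand(n, n)
W_minus = np.random.rand(n, n)
\end{verbatim}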
The training technique must adjust the parameters $\mathbf{W^+}$ and $\mathbf{W^-}$ in order to minimize a cost function $E_k$, given by

\begin{displaymath}
E_k=\frac{1}{2}\sum_{i=1}^n a_i\left(\varrho_i-y_{ik}\right)^2, \qquad a_i\ge 0,
\end{displaymath} (10.1)
where $\varrho_i$ and $y_{ik}$ are the neuron's actual and desired output for the input-output pair $k$, respectively. The constant $a_i$ is set to zero for the neurons that are not connected to the outputs of the network. At each successive input-output pair $k$, the adjustable network parameters $w^+_{u,v}$ and $w^-_{u,v}$, where $u,v=1,\ldots,n$, need to be updated. The algorithm used by Gelenbe is gradient descent. Observe that the values in the matrices $\mathbf{W^+}$ and $\mathbf{W^-}$ must not be negative (since their elements are transition rates).
The rule for the weights update is as follows:

\begin{displaymath}
\begin{array}{c}
\displaystyle w^{+(k)}_{u,v}=w^{+(k-1)}_{u,v}-\eta\sum_{i=1}^n a_i\left(\varrho_i^{(k)}-y_{ik}\right)[\partial\varrho_i/\partial w^+_{u,v}]^{(k)},\\[2mm]
\displaystyle w^{-(k)}_{u,v}=w^{-(k-1)}_{u,v}-\eta\sum_{i=1}^n a_i\left(\varrho_i^{(k)}-y_{ik}\right)[\partial\varrho_i/\partial w^-_{u,v}]^{(k)}.
\end{array}
\end{displaymath} (10.2)
Here, $\eta>0$ (it is called the learning rate), and $\varrho_i^{(k)}$ is the output of neuron $i$ calculated from the $k$th input and from the network equations by setting $w_{u,v}=w^{(k-1)}_{u,v}$, where $w_{u,v}$ can be either $w^+_{u,v}$ or $w^-_{u,v}$.
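As an illustration, the cost (10.1) and one update step (10.2) translate directly into code. The following is a minimal NumPy sketch under assumed conventions: grad_p[u, v, i] stands for $[\partial\varrho_i/\partial w^+_{u,v}]^{(k)}$ (how to compute it is derived below, via Eqn. 10.9), and grad_m is the analogous array for the inhibition weights.

\begin{verbatim}
import numpy as np

def cost(a, rho, y_k):
    """E_k of Eqn. (10.1): one half the weighted squared error."""
    return 0.5 * np.sum(a * (rho - y_k) ** 2)

def update_weights(W_plus, W_minus, grad_p, grad_m, a, rho, y_k, eta):
    """Gradient-descent step of Eqn. (10.2).

    grad_p[u, v, i] is assumed to hold d(rho_i)/d(w+_{u,v}),
    and grad_m the same for the inhibition weights w-_{u,v}."""
    err = a * (rho - y_k)                       # a_i (rho_i^{(k)} - y_ik)
    W_plus_new = W_plus - eta * (grad_p @ err)  # '@' sums over i
    W_minus_new = W_minus - eta * (grad_m @ err)
    return W_plus_new, W_minus_new
\end{verbatim}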
Recall that

\begin{displaymath}
\varrho_i=\frac{\lambda^+_i}{r_i+\lambda^-_i},
\end{displaymath} (10.3)

where

\begin{displaymath}
\lambda^+_i=\sum_j \varrho_j w^+_{j,i}+\Lambda_i, \qquad
\lambda^-_i=\sum_j \varrho_j w^-_{j,i}+\lambda_i,
\end{displaymath} (10.4)

and

\begin{displaymath}
r_i=\sum_{j=1}^n [w^+_{i,j}+w^-_{i,j}].
\end{displaymath} (10.5)
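For instance, the nonlinear system (10.3)-(10.5) can be solved by plain fixed-point iteration. The sketch below is one possible choice; the solver, the tolerance, and the function name solve_network are assumptions, since the algorithm only requires that the system be solved.

\begin{verbatim}
import numpy as np

def solve_network(W_plus, W_minus, Lambda_k, lam_k,
                  tol=1e-10, max_iter=10000):
    """Solve Eqns. (10.3)-(10.5) for the steady-state outputs rho_i
    by fixed-point iteration."""
    r = W_plus.sum(axis=1) + W_minus.sum(axis=1)  # Eqn. (10.5)
    rho = np.zeros_like(Lambda_k)
    for _ in range(max_iter):
        lam_plus = rho @ W_plus + Lambda_k        # Eqn. (10.4)
        lam_minus = rho @ W_minus + lam_k
        rho_new = lam_plus / (r + lam_minus)      # Eqn. (10.3)
        if np.max(np.abs(rho_new - rho)) < tol:
            rho = rho_new
            break
        rho = rho_new
    lam_minus = rho @ W_minus + lam_k             # recompute for the caller
    return rho, r, lam_minus
\end{verbatim}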
Using Eqn. 10.3 to compute $\partial\varrho_i/\partial w^\pm_{u,v}$, we have:

\begin{displaymath}
\partial\varrho_i/\partial w^+_{u,v}=
\sum_j\left[\partial\varrho_j/\partial w^+_{u,v}\right]\left[w^+_{j,i}-w^-_{j,i}\varrho_i\right]/D_i
-1[u\equiv i]\,\varrho_i/D_i+1[v\equiv i]\,\varrho_u/D_i,
\end{displaymath} (10.6)

\begin{displaymath}
\partial\varrho_i/\partial w^-_{u,v}=
\sum_j\left[\partial\varrho_j/\partial w^-_{u,v}\right]\left[w^+_{j,i}-w^-_{j,i}\varrho_i\right]/D_i
-1[u\equiv i]\,\varrho_i/D_i-1[v\equiv i]\,\varrho_u\varrho_i/D_i,
\end{displaymath} (10.7)
where $D_i=r_i+\lambda^-_i$ and $1[x]$ equals 1 if the condition $x$ is true and 0 otherwise. Let $\mathbf{\varrho}=(\varrho_1,\ldots,\varrho_n)$, and define the $n\times n$ matrix:

\begin{displaymath}
\mathbf{W}=\left\{\left[w^+_{i,j}-w^-_{i,j}\varrho_j\right]/D_j\right\}, \qquad i,j=1,\ldots,n.
\end{displaymath}

We can now write

\begin{displaymath}
\begin{array}{l}
\partial\mathbf{\varrho}/\partial w^+_{u,v}=\partial\mathbf{\varrho}/\partial w^+_{u,v}\,\mathbf{W}+\gamma^+_{u,v}\,\varrho_u,\\[1mm]
\partial\mathbf{\varrho}/\partial w^-_{u,v}=\partial\mathbf{\varrho}/\partial w^-_{u,v}\,\mathbf{W}+\gamma^-_{u,v}\,\varrho_u,
\end{array}
\end{displaymath} (10.8)

where the elements of the $n$-vectors $\gamma^+_{u,v}$, $\gamma^-_{u,v}$ are given by:

\begin{displaymath}
\gamma^+_{i,(u,v)}=\left\{
\begin{array}{ll}
-1/D_i & \mbox{if } u=i,\ v\neq i,\\
+1/D_i & \mbox{if } u\neq i,\ v=i,\\
0 & \mbox{otherwise},
\end{array}\right.
\qquad
\gamma^-_{i,(u,v)}=\left\{
\begin{array}{ll}
-(1+\varrho_i)/D_i & \mbox{if } u=i,\ v=i,\\
-1/D_i & \mbox{if } u=i,\ v\neq i,\\
-\varrho_i/D_i & \mbox{if } u\neq i,\ v=i,\\
0 & \mbox{otherwise}.
\end{array}\right.
\end{displaymath}
It can be observed that Eqn. 10.8 can be rewritten as follows:

\begin{displaymath}
\begin{array}{l}
\partial\mathbf{\varrho}/\partial w^+_{u,v}=\gamma^+_{u,v}\,\varrho_u\,[\mathbf{I-W}]^{-1},\\[1mm]
\partial\mathbf{\varrho}/\partial w^-_{u,v}=\gamma^-_{u,v}\,\varrho_u\,[\mathbf{I-W}]^{-1},
\end{array}
\end{displaymath} (10.9)

where $\mathbf{I}$ is the $n\times n$ identity matrix.
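Concretely, since Eqn. 10.9 involves row vectors, $\mathbf{x}\,[\mathbf{I-W}]=\gamma^\pm_{u,v}\varrho_u$ can be solved as $[\mathbf{I-W}]^T\mathbf{x}^T=\varrho_u(\gamma^\pm_{u,v})^T$, avoiding an explicit matrix inverse. A hypothetical helper, reusing the quantities returned by the solve_network sketch above, might read:

\begin{verbatim}
import numpy as np

def output_gradients(u, v, rho, r, lam_minus, W_plus, W_minus):
    """Solve Eqn. (10.9) for the n-vectors d(rho)/dw+_{u,v} and
    d(rho)/dw-_{u,v}, given the quantities returned by solve_network."""
    n = rho.size
    D = r + lam_minus                 # D_i = r_i + lambda^-_i
    W = (W_plus - W_minus * rho) / D  # W[i,j] = [w+_{i,j} - w-_{i,j} rho_j] / D_j
    A_T = (np.eye(n) - W).T           # transpose of [I - W]

    # Build gamma^+_{u,v} and gamma^-_{u,v}; the += form reproduces
    # the case distinctions above (the u = v cases fall out automatically)
    gp = np.zeros(n)
    gm = np.zeros(n)
    gp[u] += -1.0 / D[u]
    gp[v] += 1.0 / D[v]
    gm[u] += -1.0 / D[u]
    gm[v] += -rho[v] / D[v]

    # x [I - W] = gamma rho_u   <=>   [I - W]^T x^T = rho_u gamma^T
    drho_dwp = np.linalg.solve(A_T, rho[u] * gp)
    drho_dwm = np.linalg.solve(A_T, rho[u] * gm)
    return drho_dwp, drho_dwm
\end{verbatim}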
The training algorithm can be started by first initializing the two matrices $\mathbf{W}^{+(0)}$ and $\mathbf{W}^{-(0)}$ randomly. Then, we have to choose the value of $\eta$. For each successive value $k$, starting from $k=1$, one must proceed as follows (a sketch of the whole procedure is given below):

1. Set the input values to $X_k=(\Lambda_k,\lambda_k)$.
2. Solve the system of nonlinear equations 10.3 and 10.4.
3. Solve the system of linear equations 10.9 with the results of step 2.
4. Using Eqn. 10.2 and the results of Eqn. 10.3 and Eqn. 10.4, update the matrices $\mathbf{W}^{+(k)}$ and $\mathbf{W}^{-(k)}$. As stated before, the values in the two matrices $\mathbf{W^+}$ and $\mathbf{W^-}$ must not be negative. Thus, if the iteration yields a negative value for a term at some step $k$ of the algorithm, we consider two possibilities:
   (a) Set the negative value to zero and stop the iteration for this term at this step $k$. In the next step $k+1$, iterate on this term with the same rule, starting from its current zero value.
   (b) Go back to the previous value of the term and iterate with a smaller value of $\eta$.

After the training phase, the network can be used in normal operation (the only computations needed to obtain the outputs are those given by equations 10.3 and 10.4).
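Putting the pieces together, the complete procedure might be sketched as follows, reusing the hypothetical solve_network and output_gradients helpers from the earlier sketches and adopting option (a) (clipping negative terms to zero) for simplicity.

\begin{verbatim}
import numpy as np

def train_rnn(W_plus, W_minus, Lambda, lam, y, a, eta=0.01, epochs=10):
    """Training loop following steps 1-4 above; negative terms are
    handled with option (a), i.e. clipped to zero after each update."""
    n = W_plus.shape[0]
    for _ in range(epochs):
        for k in range(Lambda.shape[0]):
            # Steps 1-2: set the k-th input, solve Eqns. (10.3)-(10.4)
            rho, r, lam_minus = solve_network(W_plus, W_minus,
                                              Lambda[k], lam[k])
            err = a * (rho - y[k])          # a_i (rho_i^{(k)} - y_ik)
            # Step 3: solve the linear systems of Eqn. (10.9)
            grad_p = np.zeros((n, n))
            grad_m = np.zeros((n, n))
            for u in range(n):
                for v in range(n):
                    dp, dm = output_gradients(u, v, rho, r, lam_minus,
                                              W_plus, W_minus)
                    grad_p[u, v] = err @ dp
                    grad_m[u, v] = err @ dm
            # Step 4: gradient-descent update, Eqn. (10.2)
            W_plus -= eta * grad_p
            W_minus -= eta * grad_m
            np.clip(W_plus, 0.0, None, out=W_plus)   # option (a)
            np.clip(W_minus, 0.0, None, out=W_minus)
    return W_plus, W_minus
\end{verbatim}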