Working Group Matrix Computations and Statistics

First workshop

Geneva, 4-5 May, 2001
 

Programme
Abstracts
Participants
 

Programme

Friday, 4th of May, 2001
------------------------

8h50-9h00   Opening

9h00-9h45   Jocelyne Erhel
            Iterative solvers for large sparse linear systems 
            (.ps file)

9h45-10h30  Stratis Gallopoulos
            Towards effective methods for computing matrix
            pseudospectra (.ps file)  

10h30-11h00 Coffee Break

11h00-11h45 Bernard Philippe 
            Parallel computation of the smallest singular values of a
            matrix (.ps file) 

11h45-12h30 George Moustakides
            Eigen-decomposition of a class of infinite dimensional
            block tridiagonal matrices (.ppt file) 

12h30-14h00 Lunch

14h00-14h45 Lars Eldén
            Solving quadratically constrained least squares problems using 
            a differential-geometric approach (.ps file) 
 
14h45-15h30 Bart De Moor
            Least-squares support vector machines  (.ps file)

15h30-16h00 Coffee Break

16h00-16h45 Hans Bruun Nielsen
            Algorithms and applications of Huber estimation 
            (.ps file) 

16h45-17h30 Roger Payne
            Analysis of variance, general balance and large data sets  
            (.ppt file) 

17h30-18h15 Erricos Kontoghiorghes, Paolo Foschi and Cristian Gatu
            Solving linear model estimation problems (.ps file)  
            
19h00-...   Dinner


Saturday, 5th of May, 2001
--------------------------

9h00-9h45   Giorgio Pauletto and Manfred Gilli
            Solving economic and financial problems with parallel
            methods (.ps file)

9h45-10h30  Zahari Zlatev
            Computational challenges in the treatment of large-scale
            environmental models (.ppt file) 

10h30-11h00 Coffee Break

11h00-11h45 Rasmus Bro
            New multi-way models and algorithms for solving blind
            source separation problems  (.pdf file)

11h45-12h30 Discussion

12h30-14h00 Lunch 






Abstracts


Iterative solvers for large sparse linear systems

Jocelyne Erhel
Projet Aladin, IRISA/INRIA-Rennes Campus de Beaulieu, 35042 Rennes
cedex, France


Large sparse linear systems arise in many scientific applications.
Krylov iterative methods require less memory than direct methods; they
can even be matrix-free, with no matrix storage at all, since the only
operation required is the matrix-vector product. For symmetric
positive definite matrices, the method of choice is the Conjugate
Gradient method. For indefinite or unsymmetric matrices, the situation
is far more complicated: several methods exist, with no single best
one.  In all cases, a difficult point is to find an efficient
preconditioner, which speeds up convergence at a low computational
cost.  Examples are given to illustrate the properties of Krylov
methods.  The final example, which comes from the discretisation of a
3D biharmonic problem, shows that multigrid and multilevel methods are
a promising preconditioning approach.
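
The properties mentioned above can be made concrete with a minimal,
dependency-free sketch of the Conjugate Gradient method for a
symmetric positive definite system Ax = b; the 3x3 matrix and
right-hand side below are invented for illustration, not taken from
the talk.

```python
def matvec(A, x):
    """Dense matrix-vector product (sparse storage would replace this)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(A, b, tol=1e-16, max_iter=100):
    """CG for SPD A; tol is on the squared residual norm."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual b - A x for x = 0
    p = r[:]                      # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = conjugate_gradient(A, b)
```

Note that the matrix enters only through matvec, which is exactly what
makes matrix-free variants possible.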


-------------------------------------------------------------------

Towards effective methods for computing matrix pseudospectra

Stratis Gallopoulos
Department of Computer Engineering & Informatics Patras 26500, Greece


Given a matrix A, the computation of its pseudospectrum, that is, the
locus of eigenvalues of matrices of the form A+E, for E bounded in
norm by some small epsilon, is a far more expensive task than the
computation of characteristics such as the condition number and the
matrix spectrum. As research over the last 15 years has shown,
however, the matrix pseudospectrum provides valuable information that
is not included in the other indicators. The question, then, is how to
compute it efficiently, and how to build a tool that enables engineers
and scientists to perform such analyses.  We will consider this
problem from the point of view of 1) the basic computational kernels,
2) domain-based information, 3) parallelism and 4) the programming
environment, and will review our recent efforts on this subject.
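
The grid approach behind the basic computational kernels can be
sketched as follows: z lies in the epsilon-pseudospectrum of A exactly
when sigma_min(A - zI) <= epsilon. For the 2x2 Jordan block used here,
the smallest singular value has a closed form via the eigenvalues of
(A - zI)^H (A - zI); the matrix, the epsilon and the grid are all
illustrative choices, not taken from the talk.

```python
import math

def sigma_min_2x2(m):
    """Smallest singular value of a 2x2 complex matrix."""
    (a, b), (c, d) = m
    # Eigenvalues of the Hermitian matrix H = M^H M give sigma^2.
    h11 = abs(a) ** 2 + abs(c) ** 2
    h22 = abs(b) ** 2 + abs(d) ** 2
    h12 = a.conjugate() * b + c.conjugate() * d
    tr = h11 + h22
    det = h11 * h22 - abs(h12) ** 2
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return math.sqrt(max((tr - disc) / 2.0, 0.0))

def pseudospectrum_grid(a, eps, xs, ys):
    """Return grid points z with sigma_min(A - zI) <= eps."""
    pts = []
    for x in xs:
        for y in ys:
            z = complex(x, y)
            shifted = [[a[0][0] - z, a[0][1]],
                       [a[1][0], a[1][1] - z]]
            if sigma_min_2x2(shifted) <= eps:
                pts.append(z)
    return pts

A = [[0.0 + 0j, 1.0 + 0j],
     [0.0 + 0j, 0.0 + 0j]]       # Jordan block, eigenvalue 0
grid = [k / 10.0 for k in range(-15, 16)]
inside = pseudospectrum_grid(A, 0.5, grid, grid)
```

For an n x n matrix, each grid point costs a full sigma_min
computation, which is what makes the efficient kernels discussed in
the talk essential.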

-------------------------------------------------------------------

Parallel computation of the smallest singular values of a matrix

Bernard Philippe
Projet Aladin, IRISA/INRIA-Rennes Campus de Beaulieu, 35042 Rennes
cedex, France


The smallest singular value of a matrix can be seen as its distance to
singularity. For linear systems, it provides the condition number of
the system to be solved. When several singular values are zero, it is
often necessary to know the numerical rank of the matrix and to build
a basis of the null space. The computation of the smallest singular
value also arises when determining the pseudospectrum of a matrix in
order to analyse the sensitivity of its eigenvalues to matrix
perturbations.  There are different ways to compute such singular
values, depending on the matrices considered. For dense matrices, the
method of choice is implemented in LAPACK. For large sparse matrices,
there exist several methods based on the construction of subspaces of
small or medium dimension (Lanczos, Davidson, Trace Minimization). We
shall discuss their respective advantages as well as their efficiency
on parallel computers.
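
A rough, dependency-free illustration of the link between the smallest
singular value and a linear solve is inverse iteration on A^T A. The
dense Gaussian elimination below stands in for the sparse subspace
methods (Lanczos, Davidson, Trace Minimization) discussed in the talk,
and the 3x2 matrix is an arbitrary example.

```python
import math

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] for row in M]
    rhs = rhs[:]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        rhs[k], rhs[p] = rhs[p], rhs[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= f * M[k][j]
            rhs[i] -= f * rhs[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rhs[i] - s) / M[i][i]
    return x

def smallest_singular_value(A, iters=50):
    """Inverse iteration on A^T A; each step costs one linear solve."""
    n = len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(len(A)))
            for j in range(n)] for i in range(n)]
    v = [1.0] * n
    for _ in range(iters):
        w = solve(AtA, v)
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    # Rayleigh quotient of A^T A at the converged vector is sigma_min^2.
    Av = [sum(AtA[i][j] * v[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(vi * avi for vi, avi in zip(v, Av)))

A = [[2.0, 0.0],
     [0.0, 0.5],
     [0.0, 0.0]]
sigma = smallest_singular_value(A)    # exact answer here is 0.5
```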



-------------------------------------------------------------------

Eigen-decomposition of a class of infinite dimensional block
tridiagonal matrices

George V. Moustakides
Department of Computer Engineering and Informatics, 
University of Patras, 26 500 Patras, Greece


We consider the eigen-decomposition problem of a special class of
infinite dimensional block tri-diagonal matrices.  Such problems occur
in linearly polarized laser fields when one attempts to analyze the
scattering of free electrons.

We show that by using the Discrete Fourier Transform we can reduce the
original infinite dimensional eigen-decomposition problem to a
combination of a linear system of differential equations and a new
eigenvalue-eigenvector problem, both of the size of a single
block.

We consider numerical integration methods for the solution of the
differential equation that respect the special structure of its
solution. Competing methods are compared with respect to their
accuracy and computational complexity.

Finally, we propose an FFT-based technique that can efficiently
compute the eigenvectors of the original infinite dimensional problem
from the corresponding eigenvectors of the finite dimensional one.

-------------------------------------------------------------------

Solving quadratically constrained least squares problems using 
a differential-geometric approach

Lars Eldén
Department of Mathematics, Linköping University, 
SE-581 83, Linköping, Sweden
email: laeld@math.liu.se


A quadratically constrained linear least squares problem is usually
solved using a Lagrange multiplier for the constraint and then solving
numerically a nonlinear secular equation for the optimal Lagrange
multiplier. It is well known that, because the optimal multiplier lies
close to a pole of the secular equation, standard methods for solving
it can be very slow. The problem can be reformulated as that of
minimizing the residual of the least squares problem on the unit
sphere. Using a differential-geometric approach we formulate Newton's
method on the sphere, and thereby avoid the difficulties associated
with the Lagrange multiplier formulation. This Newton method on the
sphere can be implemented efficiently, and since its convergence is
often quite fast it appears to be superior to the Lagrange multiplier
method. A few numerical examples are given. We also discuss briefly
the extension to orthogonal Procrustes problems.
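
For contrast, the Lagrange multiplier route that the abstract
describes can be sketched in a few lines when A is diagonal, so that
its SVD is trivial: for min ||Ax - b|| subject to ||x|| = alpha, the
optimal multiplier is a root of the secular equation below. Bisection
is used here purely for robustness, and all the data are invented.

```python
import math

sigma = [3.0, 1.0, 0.5]     # singular values of the diagonal A
beta = [1.0, 2.0, 1.0]      # U^T b (here just b) in the SVD basis
alpha = 0.8                 # prescribed norm of the solution

def norm_x(lam):
    """||x(lambda)|| with x_i = sigma_i beta_i / (sigma_i^2 + lambda)."""
    return math.sqrt(sum((s * t / (s * s + lam)) ** 2
                         for s, t in zip(sigma, beta)))

# norm_x is decreasing for lam >= 0, so bracket the root and bisect.
lo, hi = 0.0, 1.0
while norm_x(hi) > alpha:
    hi *= 2.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if norm_x(mid) > alpha:
        lo = mid
    else:
        hi = mid
lam = 0.5 * (lo + hi)
x = [s * t / (s * s + lam) for s, t in zip(sigma, beta)]
```

The difficulty the talk addresses is precisely that faster root
finders than bisection behave badly near the pole at lam = -sigma_min^2.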


-------------------------------------------------------------------

Least-squares support vector machines

Bart De Moor
Department of Electrical Engineering, Katholieke Universiteit Leuven,
Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium


LS-SVM is a recently developed parametrization for non-linear
multi-dimensional regression problems.  Data vectors are non-linearly
mapped, via so-called feature vectors, into a high-dimensional space
(possibly infinite), in which the regression problem can be formulated
as a regularized least squares problem. Via a so-called Mercer
condition and by using vectors of Lagrange multipliers, the problem
reduces to solving a square system of linear equations whose size is
the number of data points (which may be large).  In this talk, we
will develop the main ideas behind LS-SVM and comment on the
computational challenges for large regression problems. We will also
show how in the same framework, the Mercer condition allows for the
formulation of non-linear kernel PCA (principal components analysis)
and non-linear kernel CCA (canonical correlations analysis).
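
The reduction to a single square linear system can be sketched for
regression in one variable; the RBF kernel, its width, the
regularization constant gamma and the toy data are all arbitrary
choices, not the talk's setup.

```python
import math

def rbf(x, y, width=1.0):
    """Gaussian (RBF) kernel on scalars."""
    return math.exp(-(x - y) ** 2 / (2.0 * width ** 2))

def solve(M, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] for row in M]
    rhs = rhs[:]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        rhs[k], rhs[p] = rhs[p], rhs[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= f * M[k][j]
            rhs[i] -= f * rhs[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rhs[i] - s) / M[i][i]
    return x

def lssvm_train(xs, ys, gamma=100.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(xs)
    M = [[0.0] + [1.0] * n]
    for i in range(n):
        M.append([1.0] + [rbf(xs[i], xs[j]) +
                          (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(M, [0.0] + list(ys))
    return sol[0], sol[1:]          # bias, Lagrange multipliers

def lssvm_predict(x, xs, bias, alpha):
    return bias + sum(a * rbf(x, xi) for a, xi in zip(alpha, xs))

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.sin(v) for v in xs]
bias, alpha = lssvm_train(xs, ys)
```

The system is dense and its size grows with the number of data points,
which is precisely the computational challenge raised in the abstract.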

-------------------------------------------------------------------

Algorithms and applications of Huber estimation

Hans Bruun Nielsen
Department of Mathematical Modelling, Technical University of Denmark

Huber's M-estimator was proposed for robust parameter estimation.  It
can be interpreted as a combination of the L1 and L2 estimators: a
threshold T distinguishes between small and large residuals, which
contribute to the objective function by their square and their
absolute value, respectively.  The result is a piecewise quadratic,
convex function.  We present an efficient method for finding a
minimizer of this 'Huber function'.  Further, the algorithm has a
number of applications: as T goes to zero, the 'Huber solution'
converges piecewise linearly to an L1 solution of the problem, and
the active set is identified for a strictly positive value of T,
i.e. while it still has a (beneficial) smoothing effect on the
iteration.  This approach can be generalized to an algorithm for
linear programming (LP) problems.  Finally, we show how the Huber
algorithm can be used to solve box-constrained quadratic programming
(QP) problems.
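
For the simplest case, estimating a location parameter, a generic
iteratively reweighted least squares scheme (not the specific
algorithm of the talk) shows how the Huber function blends the L1 and
L2 estimators:

```python
def huber_location(data, T, iters=200):
    """Minimize the Huber function of the residuals x_i - mu over mu."""
    mu = sum(data) / len(data)          # start from the mean (L2 estimate)
    for _ in range(iters):
        # Weight 1 for residuals below T, T/|r| above: this reproduces
        # the square / absolute-value split of the objective.
        w = [1.0 if abs(x - mu) <= T else T / abs(x - mu) for x in data]
        mu = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
    return mu

data = [0.9, 1.0, 1.1, 1.2, 8.0]        # one gross outlier
estimate = huber_location(data, T=0.5)  # stays near 1, unlike the mean 2.44
```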



-------------------------------------------------------------------
Analysis of variance, general balance and large data sets

Roger Payne
IACR-Rothamsted, Harpenden, Herts, AL5 2JQ, U.K.


General balance is relevant to experimental designs with several block
(or error) terms. The total sum of squares can then be partitioned
into components known as strata, one for each block term, each containing
the sum of squares for the treatment terms estimated between the units
of that stratum and a residual representing the random variability of
those units. The properties of a generally balanced design are that
(1) the block (or error) terms are mutually orthogonal, (2) the
treatment terms are also mutually orthogonal, and (3) the contrasts of
each treatment term all have equal efficiency factors in each of the
strata where they are estimated.

The treatment estimates in the various strata can be calculated very
efficiently using the algorithm of Payne & Wilkinson (1976). This does
not require the formation and inversion of matrices of sums of squares
and products between treatment effects. Instead it involves a sequence
of "sweep" operations (which can be formulated as projection
operators). When treatment effects are estimated in several strata,
estimates that combine this information can be calculated using the
algorithm of Payne & Tobias (1992).  These two algorithms can handle
models with large numbers of treatments much more efficiently than
conventional methods. However, data sets of this size arising from
recent application areas such as data mining will usually be
unbalanced. Related algorithms exist that can handle unbalanced data
(see e.g.  Hemmerle 1974 and Worthington 1975), and it might be
interesting to see how these compare with conventional methods on
large data sets.

-------------------------------------------------------------------

Solving linear model estimation problems

Erricos Kontoghiorghes, Paolo Foschi
and Cristian Gatu
Institut d'informatique, Universite de Neuchatel, Rue Emile-Argand 11,
Case Postale 2, CH-2007 Neuchatel, Switzerland


Algorithms for solving large-scale linear models are considered.
Numerically stable methods for computing the least-squares estimators
and regression diagnostics of the standard and general linear models,
seemingly unrelated regression and simultaneous equations models are
discussed.  The least-squares computations are based on orthogonal
transformations, in particular the QR decomposition.
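
The QR-based least squares computation at the core of these methods
can be sketched as follows, using modified Gram-Schmidt rather than
the Householder transformations a production code would use; the
overdetermined system is an invented example.

```python
import math

def qr_mgs(A):
    """Thin QR of a full-rank m x n matrix by modified Gram-Schmidt."""
    m, n = len(A), len(A[0])
    Q = [[A[i][j] for j in range(n)] for i in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = math.sqrt(sum(Q[i][j] ** 2 for i in range(m)))
        for i in range(m):
            Q[i][j] /= R[j][j]
        for k in range(j + 1, n):
            R[j][k] = sum(Q[i][j] * Q[i][k] for i in range(m))
            for i in range(m):
                Q[i][k] -= R[j][k] * Q[i][j]
    return Q, R

def lstsq_qr(A, b):
    """Least squares via A = QR: solve R x = Q^T b by back substitution."""
    Q, R = qr_mgs(A)
    n = len(R)
    qtb = [sum(Q[i][j] * b[i] for i in range(len(b))) for j in range(n)]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(R[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (qtb[i] - s) / R[i][i]
    return x

A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 2.9, 5.1, 7.0]      # roughly b = 1 + 2t plus noise
x = lstsq_qr(A, b)            # fitted intercept and slope
```

The orthogonal factorization avoids forming the normal equations, which
is what gives these estimators their numerical stability.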

-------------------------------------------------------------------

Solving economic and financial problems with parallel methods

Giorgio Pauletto
and Manfred Gilli
Department of Econometrics, University of Geneva, Uni Mail 40 Bd du
Pont d'Arve CH-1211 Genève 4 Switzerland


The use of computational methods in economics and finance has greatly
increased in recent years. We will present some of the recent studies
in macroeconometric model simulation and financial option valuation
carried out at the Department of Econometrics of the University of
Geneva. We will also give an overview of future applications of
computationally intensive methods in economics and finance.

The simulation of large macroeconometric models containing
forward-looking variables can become impractical when using exact
Newton methods. In such cases, Krylov methods provide an interesting
alternative. We also discuss a block preconditioner suitable for the
particular class of models solved and study the parallel solution of
the sparse linear system arising in the Newton algorithm.

Finance is another area of application of intensive computational
methods. We examine the valuation of options on several underlying
assets using finite difference methods. Implicit methods, which have
good convergence and stability properties, can be implemented
efficiently thanks to Krylov solvers. A parallel implementation on a
cluster of PCs is carried out, showing that large problems can be
tackled efficiently, particularly when a fine spatial grid is needed.
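
The implicit finite difference machinery can be illustrated on the 1-D
heat equation u_t = u_xx, the diffusion core of option pricing PDEs,
rather than on a full multi-asset model: one backward Euler step needs
a tridiagonal solve, done here with the Thomas algorithm instead of
the Krylov solvers a several-asset grid would require. The grid sizes
are arbitrary.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n); no pivoting, which is safe
    here because the matrix is diagonally dominant."""
    n = len(diag)
    c, d = upper[:], rhs[:]
    c[0] /= diag[0]
    d[0] /= diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] /= m
        d[i] = (d[i] - lower[i - 1] * d[i - 1]) / m
    for i in range(n - 2, -1, -1):
        d[i] -= c[i] * d[i + 1]
    return d

def backward_euler_step(u, dt, dx):
    """One implicit step for u_t = u_xx with zero boundary values."""
    n = len(u)
    r = dt / dx ** 2
    return thomas([-r] * (n - 1), [1.0 + 2.0 * r] * n, [-r] * (n - 1), u)

dx, dt = 0.1, 0.01
u = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # initial spike
for _ in range(10):
    u = backward_euler_step(u, dt, dx)
```

Unlike an explicit scheme, the implicit step is stable for any ratio
dt/dx^2, at the price of a linear solve per step.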


-------------------------------------------------------------------

Computational challenges in the treatment of large-scale
environmental models

Zahari Zlatev
National Environmental Research Institute Department for Atmospheric
Environment Frederiksborgvej 399, P. O. Box 358 DK-4000 Roskilde,
Denmark


Environmental models are typically described by systems of partial
differential equations (PDEs) that are defined on large space domains.
The number of equations is equal to the number of species that are to
be studied by the model.  The PDE system can be transformed, by using
splitting and discretization techniques, into systems of linear
algebraic equations that have to be treated numerically during many
(typically several thousand) time-steps. The size of these systems
depends on the resolution. High resolution 3-D models, which are
defined on a (480x480x10) grid covering the whole of Europe, contain
more than 80 million equations when a chemical scheme with 35 species
is adopted in the model. Although it is desirable to solve this
problem, it is still not possible to achieve this even when the
fastest available computers are used. This is why some simplifications
are always introduced in order to make the problems tractable
numerically.  In the above example, the problem can be solved by using
2-D versions of the model, by which the computational tasks are,
roughly speaking, reduced by a factor of ten.

Many theoretical and numerical problems have to be solved in an
efficient way in order to improve the existing environmental models.
Some of these problems will be sketched in this talk.


-------------------------------------------------------------------

New multi-way models and algorithms for solving blind source
separation problems

Rasmus Bro
Chemometrics Group, Dept. of Dairy and Food Science The Royal
Veterinary and Agricultural University Rolighedsvej 30, DK-1958
Frederiksberg C, Denmark


In the sixties and seventies, new models were developed in
psychometrics for handling data that consist of several sets of
matrices.  Such data can be arranged in a box rather than a table, and
are called multi-way data. Analogously to standard two-way techniques
such as principal component analysis/SVD, methods exist for
decomposing multi-way data. One of these methods in particular, the
PARAFAC model (PARallel FACtor analysis), is very interesting in that
it provides parameters that are unique up to simple scaling and
permutation.

Given access to two or more matrices that follow the same bilinear
model, only in different proportions, one can obtain complete
identification of the underlying components in the mixtures. This is
in contrast to the rotational indeterminacy of traditional bilinear
modeling and is of practical importance in a number of disciplines
such as psychology, chemistry and DSP. Examples of different models
will be given, along with a discussion of current algorithms and their
problems.
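
The alternating least squares idea commonly used to fit a PARAFAC
model can be sketched for the simplest possible case, one component
and a small third-order array; the tensor is built from known factors
so the fit can be checked, whereas real PARAFAC codes handle several
components and much larger data.

```python
def rank1_parafac(T, iters=50):
    """Alternating least squares for a one-component PARAFAC model
    t_ijk ~ a_i b_j c_k."""
    I, J, K = len(T), len(T[0]), len(T[0][0])
    a, b, c = [1.0] * I, [1.0] * J, [1.0] * K
    for _ in range(iters):
        # Each update is the exact least squares solution for one
        # factor with the other two held fixed.
        d = sum(x * x for x in b) * sum(x * x for x in c)
        a = [sum(T[i][j][k] * b[j] * c[k]
                 for j in range(J) for k in range(K)) / d
             for i in range(I)]
        d = sum(x * x for x in a) * sum(x * x for x in c)
        b = [sum(T[i][j][k] * a[i] * c[k]
                 for i in range(I) for k in range(K)) / d
             for j in range(J)]
        d = sum(x * x for x in a) * sum(x * x for x in b)
        c = [sum(T[i][j][k] * a[i] * b[j]
                 for i in range(I) for j in range(J)) / d
             for k in range(K)]
    return a, b, c

# A 2x2x3 tensor built from known factors, so the fit can be checked.
a0, b0, c0 = [1.0, 2.0], [1.0, 0.5], [2.0, 1.0, 0.5]
T = [[[ai * bj * ck for ck in c0] for bj in b0] for ai in a0]
a, b, c = rank1_parafac(T)
```

The recovered factors reproduce the tensor exactly, but only up to the
scaling and permutation indeterminacy mentioned above.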