
VECFEM3 Reference Manual: LSOLPP
Type: FORTRAN routine
NAME
LSOLPP - LINSOL: Solves a linear system of equations MAT*x = b
SYNOPSIS
 call LSOLPP(lmat,lprec,liprec,lindex,l,ldw,liw,nproc,
 lsym,ia1,info,MAT,prec,x,b,dw,iprec,index,iw,ilin, lmatbk,ptrinf,tids,epslin,iconv,ierr)
 integer
 lmat,lprec,liprec,lindex,l,ldw,liw,nproc,ia1,iconv, ierr
 integer
 index(lindex),info(ia1,7),ilin(50),iprec(liprec), iw(liw),lmatbk(nproc),ptrinf(12,nproc),
tids(nproc)
 double precision
 MAT(lmat),prec(lprec),dw(ldw),x(l),b(l),epslin
 logical
 lsym
PURPOSE
LSOLPP is a routine to solve a linear system MAT*x = b
with a large and sparse l x l real matrix MAT
and a right hand side b of length l.
Different solution methods of the generalised CG method are available
and can be selected by the user. To accelerate the iteration
and to improve stability, the matrix MAT is
normalised by a diagonal matrix and/or preconditioned from the left-hand
side and/or the right-hand side. MAT is handed over in
a sparse matrix storage scheme.
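As a conceptual illustration only (a minimal Python sketch, not LSOLPP's Fortran implementation), the classical CG iteration that LSOLPP offers for symmetric matrices (ilin(1)=123) can be written with the operator passed as a matrix-vector product, as in any sparse setting:

```python
def cg(matvec, b, x=None, epslin=1e-10, itmax=200):
    """Classical conjugate gradients for a symmetric positive definite
    operator, given only as a matrix-vector product `matvec`.
    Stops when ||r||_2 <= epslin * ||b||_2 (a relative criterion,
    loosely mirroring the role of epslin in LSOLPP)."""
    n = len(b)
    x = [0.0] * n if x is None else list(x)
    r = [bi - ai for bi, ai in zip(b, matvec(x))]   # residual r = b - MAT*x
    p = list(r)                                     # first search direction
    rr = sum(ri * ri for ri in r)
    bnorm = sum(bi * bi for bi in b) ** 0.5 or 1.0
    for _ in range(itmax):
        if rr ** 0.5 <= epslin * bnorm:
            break
        ap = matvec(p)
        alpha = rr / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = sum(ri * ri for ri in r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x
```

For a 2x2 SPD matrix this converges in two steps; LSOLPP applies the same mathematics to a distributed sparse matrix with normalisation and smoothing on top.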
ARGUMENTS
If a parameter has the attribute 'local', the parameter values
can vary on different processors. If a parameter has the
attribute 'global', it must have the same value on all processors.
The label *PAR means that the parameter only has to be
declared, but does not have to be set (initialised), on non-parallel
computers. The label **PAR means that the parameter
only has to be declared, but does not have to be set, if nproc is
set (initialised) to 1 (on non-parallel computers nproc is
internally set to 1).
 lmat integer, scalar, input, local
 Length of (local) matrix array MAT.
 lprec integer, scalar, input, local
 Length of the preconditioning matrix prec. lprec
must be greater than or equal to 5*l.
 lindex integer, scalar, input, local
 Length of the integer array index.
 liprec integer, scalar, input, local
 Length of the integer array iprec. liprec
must be greater than or equal to 1.
 l integer, scalar, input, global
 Maximal number of unknowns on a processor (it must hold:
l = max{lmatbk(i),i=1,..,nproc}).
 ldw integer, scalar, input, local
 Length of the real work array dw.
ldw >= 22*l for PRES20 and the Polyalgorithm,
ldw >= 22*l for BICO, the QMR-Simulator and CG,
ldw >= 22*l otherwise.
 liw integer, scalar, input, local
 Length of the integer work array iw.
liw >= 2*nproc, if optim<>100 and optim<>200,
liw >= max(2*nproc,4*l), if optim=100 or optim=200.
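The workspace bounds above can be restated as a small helper. This is only a transcription of the documented formulas into Python for checking call sites, not part of LINSOL itself:

```python
def required_liw(nproc, l, optim):
    # Documented minimum length of the integer work array iw:
    # max(2*nproc, 4*l) when cache-reuse optimisation (optim = 100 or
    # 200) is selected, 2*nproc otherwise.
    if optim in (100, 200):
        return max(2 * nproc, 4 * l)
    return 2 * nproc

def required_ldw(l):
    # All documented cases require at least 22*l real work entries.
    return 22 * l
```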
 nproc integer, scalar, input, global
<<< *PAR
 Number of processors (processes), see combgn.
 lsym logical, scalar, input, global
 is .true., if the matrix MAT is symmetric.
 ia1 integer, scalar, input, local
 First dimension of matrix information array info, see
sparse matrix storage scheme.
The parameter must be greater than or equal to max{ptrinf(12,i), i=1,..,nproc}
on each processor.
 info integer, array: info(ia1,ia2),
input, local
 Information array for the matrix MAT (ia2 {= 7}
is a constant), see sparse matrix
storage scheme.
 MAT double precision, array: MAT(lmat),
input/output, local
 Matrix array, see sparse matrix
storage scheme.
 prec double precision, array: prec(lprec),
input, local
 Preconditioning matrix.
 x double precision, array: x(l),
input/output, local
 Solution vector.
 b double precision, array: b(l),
input/output, local
 Right hand side.
 dw double precision, array: dw(ldw),
internal, local
 Real work array.
 iprec integer, array: iprec(liprec),
input, local
 Information vector of preconditioning matrix.
 index integer, array: index(lindex),
input, local
 Array of indices allowing indirect access to the matrix
array MAT, see sparse
matrix storage scheme.
 iw integer, array: iw(liw),
internal, local
 Integer work array.
 ilin integer, array: ilin(nilin),
input/output, global
 Information vector (nilin {=50} is a constant).
 (1) = ms, input, global
 Method selection.
ms = 1 : PRES20,
ms = 2 : BICO,
ms = 3 : BiCGSTAB,
ms = 4 : ATPRES,
ms = 5 : CGS,
ms = 6 : QMR-Simulator,
ms = 7 : GMERR,
ms = 9 : CGAT for nonsymmetric matrices,
ms = 10 : Polyalgorithm PRES20 > BICO > ATPRES,
ms = 123 : Classical CG for symmetric matrices.
 (2) = itmax, input, global
 Maximal number of iteration steps.
 (3) = nmsg, input/output, global
<<< *PAR
 Message counter for the number of communication
units.
 (4) not used
 (5) not used
 (6) = isprec, input/output, global
 indicates whether the matrix MAT is already
normalised.
isprec = 0 : matrix is not normalised,
isprec = 1 : matrix is normalised.
 (7) not used
 (8) = msprec, input, global
 selects the method of preconditioning. (The
preconditioning matrices are stored as one-dimensional
arrays in prec and accessed via the
information array iprec.)
msprec = 0 : no preconditioning,
msprec = 1 : preconditioning by (complete) LU factorisation << in preparation >>
 (9) = noprec, input, global
noprec = 0 : the normalised system is considered for checking of the stopping criterion,
noprec = 1 : the original system is considered for checking of the stopping criterion.
 (10) = nmvm, output, global
 Number of computed matrix-vector multiplications.
 (11) not used
 (12) = lout, input, local
 Unit number of the standard output file (default
is 6).
 (13) = idoku, input, global
 Output control.
idoku = 0 : print only error messages,
idoku = 1 : print further information, but no output during iteration,
idoku = i>1 : output after every i iteration steps on the processor with logical process(or) number 1,
idoku = i<0 : the same as for idoku>0, with the difference that output appears on all processors.
 (14) not used
 (15) = msnorm, input, global
 selects the method of normalisation. Note: "A"
stands for a usually stored matrix (matrix MAT is
stored as a one-dimensional array and accessed
via the array info).
msnorm = 0 : no normalisation,
msnorm = 1 : normalisation by row sum
(prec(i) = sign(A_ii)/sum(abs(A_ij), j=1,l)),
msnorm = 2 : normalisation by main diagonal elements (prec(i) = 1/A_ii),
msnorm = 3 : normalisation by square sum
(prec(i) = 1/sqrt(sum(A_ij**2, j=1,l))),
msnorm = 4 : Frobenius normalisation
(prec(i) = A_ii/sum(A_ij**2, j=1,l)),
msnorm = 11, 12, 13, 14 : same as 1, 2, 3, 4,
but with the square root of the preconditioning
matrix applied from the left and from the right (automatically
selected if the matrix MAT is symmetric).
 (16) = startx, input, global
startx <> 4711 : the iteration starts from the zero vector,
startx = 4711 : an initial guess is given in x.
 (17) = myproc, input, local
<<< **PAR
 Logical process(or) number.
 (18) = smooth, input, global
smooth = 0 : the iteration returns the non-smoothed solution,
smooth <> 0 : the iteration returns the smoothed solution.
 (19) = misc, input, global
misc = 00000 : standard LSOLPP,
misc = 00001 : printing and checking - as far as possible - of vector terms (you can set ilin(21)),
misc = 00010 : check whether recursions emerge (only of use on vector computers) <not yet implemented>,
misc = 00100 : reading of the (distributed) matrix MAT in LSOLPP format (you must set ilin(22)),
misc = 01000 : storing of the (distributed) matrix MAT in LSOLPP format (you must set ilin(23)),
misc = 10000 : output of errors.
 (20) = optim, input, global
optim = 000 : no optimisation,
optim = 001 : optimisation of data structures <not yet implemented>,
optim = 010 : optimisation of data structures with printing of statistics regarding the data structures <not yet implemented>,
optim = 100 : safe cache reuse, i.e. the arrays MAT and index do not change,
optim = 200 : unsafe cache reuse, i.e. the arrays MAT and index do change.
 (21) = vtunit, input, global
 I/O unit for printing of vector terms (should be
greater than 20 and less than 100; if not set,
printing goes to standard output).
 (22) = matin, input, global
 I/O unit for reading of the matrix MAT (should
be greater than 20 and less than 100; if not set,
LSOLPP stops).
 (23) = matout, input, global
 I/O unit for storing of the matrix MAT (should
be greater than 20 and less than 100; if not set,
LSOLPP stops).
 lmatbk integer, array: lmatbk(nproc),
input, global <<< **PAR
 lmatbk(i) returns the number of unknowns on
process(or) i, see sparse
matrix storage scheme.
 ptrinf integer, array: ptrinf(ntyp+1,nproc),
input, local
 ptrinf(ityp,i)+1 points to the first entry
within the array info pointing to vector terms
of storage pattern ityp in block i; the array info
must be sorted by storage patterns (ntyp {=11}
is a constant) and by blocks, see sparse matrix storage scheme.
 tids integer, array: tids(nproc),
input, global <<< **PAR
 tids defines the mapping of the logical
process(or) numbers to the physical process IDs, see combgn.
 iconv integer, output, global
 iconv indicates the behaviour of LSOLPP.
iconv = 1 : convergence,
iconv = 2 : the iteration reached the maximal number of matrix-vector multiplications (declared in ilin(2)),
iconv = 3 : divergence,
iconv = 4 : the matrix is not suitable for LSOLPP, e.g. a non-regular (singular) matrix,
iconv = 99 : fatal error.
 ierr integer, output, local
 ierr is an error number with information about
the error.
ierr = ijk :
i = 0,..,9 indicates the level of the routine on which the error occurs,
j = 0 means: wrong parameters,
j = 1 means: error during normalisation,
j = 2 means: error during the iteration process,
j = 3 means: error during communication,
k is a general error counter with 0<k<100.
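Assuming the digits of ierr = ijk are packed with k occupying the last two decimal places (a plausible reading, since 0<k<100), a hypothetical decoding helper could look like this; the packing convention is an assumption, not stated explicitly in the manual:

```python
def decode_ierr(ierr):
    # Assumed split of ierr = ijk with 0 < k < 100: k takes the last
    # two digits, j the next digit, i the leading digit(s).
    k = ierr % 100
    j = (ierr // 100) % 10
    i = ierr // 1000
    causes = {
        0: "wrong parameters",
        1: "error during normalisation",
        2: "error during iteration process",
        3: "error during communication",
    }
    return i, causes.get(j, "unknown"), k
```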
 epslin double precision, input, global
 epslin is the relative stopping factor. If (lcond1
.and. lcond2), the iteration is stopped. If (lcond1 .and.
.not. lcond2), a new stopping criterion is computed.
lcond1 = (||(preconditioned) smoothed residual||_2
<= epslin * ||(preconditioned) r.h.s.||_2)
and
noprec = 1 : lcond2 = (||original smoothed residual||_max <= epslin * ||original r.h.s.||_max),
noprec = 0 : lcond2 = (||(preconditioned) smoothed residual||_max <= epslin * ||(preconditioned) r.h.s.||_max).
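The two-stage test can be sketched as follows, with the four norm values assumed to be computed elsewhere (on the normalised or original system according to ilin(9)=noprec); this is a restatement of the documented logic, not LINSOL code:

```python
def stop_iteration(res2, rhs2, res_max, rhs_max, epslin):
    # lcond1: Euclidean-norm test on the (preconditioned) smoothed
    # residual; lcond2: max-norm test on the system selected by noprec.
    lcond1 = res2 <= epslin * rhs2
    lcond2 = res_max <= epslin * rhs_max
    if lcond1 and lcond2:
        return "stop"                  # both criteria satisfied
    if lcond1 and not lcond2:
        return "recompute criterion"   # new stopping criterion computed
    return "continue"
```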
EXAMPLE
N/A
METHOD
The iterative methods, smoothing, and normalisation are
described in the LINSOL user's guide [10].
Normalization
At the beginning the matrix MAT is normalised from
the left by a diagonal matrix prec(1..l), if MAT
is not yet normalised (ilin(6)=isprec = 0) and a
normalisation method is chosen (ilin(15)=msnorm
<> 0). If MAT is symmetric or a symmetric
normalisation is selected, it is additionally normalised from the
right to preserve symmetry. The user can choose among different methods
of normalisation. We recommend the normalisation by row sum (ilin(15)
= 1), because in most cases it is the best method with regard
to the performed matrix-vector multiplications. On return from LSOLPP
the array MAT is normalised (ilin(6)=isprec
= 1). This is important to remember if LSOLPP
is called again or the matrix is to be used in other contexts after
calling LSOLPP!
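The recommended row-sum normalisation (msnorm = 1) can be sketched on a small dense matrix; LSOLPP applies the same scaling to its sparse storage scheme, so this Python fragment is illustrative only:

```python
import math

def row_sum_normalise(A, b):
    # msnorm = 1: prec(i) = sign(A_ii) / sum_j |A_ij|, then the system
    # is scaled from the left: A <- diag(prec)*A, b <- diag(prec)*b.
    n = len(A)
    prec = []
    for i in range(n):
        s = sum(abs(a) for a in A[i])
        prec.append(math.copysign(1.0, A[i][i]) / s)
    A_scaled = [[prec[i] * A[i][j] for j in range(n)] for i in range(n)]
    b_scaled = [prec[i] * b[i] for i in range(n)]
    return A_scaled, b_scaled, prec
```

After scaling, every row of the matrix has absolute row sum 1, which is what improves the conditioning seen by the iteration.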
Smoothing
The residual of a generalised CG method can oscillate strongly.
Therefore both the residual and the approximate solution are
smoothed in every iteration step to ensure that the Euclidean
norm of the residual does not increase. By setting the parameter ilin(18)
to zero it is possible to obtain the original solution instead of
the smoothed one, but the iteration process is always controlled
by the smoothed residual.
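One step of minimal-residual smoothing in the spirit of Zhou and Walker [9] can be sketched as below; the exact formulas LINSOL uses may differ, so treat this as an illustration of why the smoothed residual norm cannot increase:

```python
def smooth_step(s_prev, x_s_prev, r_new, x_new):
    # Choose eta minimising ||s_prev + eta*(r_new - s_prev)||_2 and
    # update the smoothed residual s and smoothed solution xs with the
    # same eta; by construction ||s||_2 <= ||s_prev||_2.
    d = [rn - sp for rn, sp in zip(r_new, s_prev)]
    dd = sum(di * di for di in d)
    eta = 0.0 if dd == 0.0 else -sum(sp * di for sp, di in zip(s_prev, d)) / dd
    s = [sp + eta * di for sp, di in zip(s_prev, d)]
    xs = [xp + eta * (xn - xp) for xp, xn in zip(x_s_prev, x_new)]
    return s, xs
```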
Nonsymmetric
For nonsymmetric problems, LSOLPP provides seven different
solution algorithms. These algorithms are implemented both with
and without smoothing, so the seven
implemented algorithms yield thirteen methods. In the following, the most
popular and "well-known" ten basic methods, one error-minimising
method and the polyalgorithm are described.
 1. PRES20 [1] (Pseudo-RESidual 20),
ilin(1)=1, ilin(18)<>0
 For sufficient diagonal dominance of the matrix this is a
quickly converging method. If the norm of the residual
decreases only in the first iteration steps, then choose
one of the other methods. The residuals decrease
monotonically.
 2a. BICO [1] (BICOnjugate gradients
smoothed), ilin(1)=2, ilin(18)<>0
 BICO is more robust than PRES20 and converges quickly
even if the matrix is not diagonally dominant, but BICO
uses two matrix-vector multiplications (MVM) per iteration
step, one by the matrix MAT and one by its transposed
matrix MAT^T. The residuals decrease monotonically.
 2b. BCG [2] (BiConjugate Gradients),
ilin(1)=2, ilin(18)=0
 BCG is more robust than PRES20 and converges quickly even
if the matrix is not diagonally dominant, but BCG uses
two matrix-vector multiplications (MVM) per iteration
step, one by the matrix MAT and one by its transposed
matrix MAT^T. The residuals oscillate heavily;
therefore, use BICO or QMR instead.
 3a. BiCGSTAB smoothed [3,8]
, ilin(1)=3, ilin(18)<>0
 The BiConjugate Gradient STABilized (BiCGSTAB) method
is a modification of CGS that should be more stable. The
residuals decrease monotonically.
 3b. BiCGSTAB [3], ilin(1)=3, ilin(18)=0
 This is a modification of CGS that should be more stable.
 4a. ATPRES = CGNR [4], ilin(1)=4,
ilin(18)<>0
 The A-Transposed Pseudo-RESidual (ATPRES) method is
equivalent to the CG Normal equations Residual-minimising
(CGNR) method and should only be used if none of the
methods BICO, CGS or BiCGSTAB converges. It may
converge slowly but it is very robust. The residuals
decrease monotonically.
 4b. CGNE (CG Normal eq. Errormin.) [5],
ilin(1)=4, ilin(18)=0
 CGNE should only be used if none of the methods BICO, CGS
or BiCGSTAB converges. It may converge slowly
but it is very robust. The errors decrease monotonically.
 5a. CGS (Conjugate Grad. Squared) smoothed, ilin(1)=5,
ilin(18)<>0
 The smoothed CGS [6,8]
is as robust as BICO. An advantage of CGS compared to
BICO is that CGS only uses matrix-vector multiplications
by MAT and not by the transposed matrix MAT^T.
Therefore it can run somewhat better on parallel computers.
The residuals decrease monotonically.
 5b. CGS (Conjugate Gradient Squared) [6],
ilin(1)=5, ilin(18)=0
 CGS is as robust as BICO. An advantage of CGS compared to
BICO is that CGS only uses matrix-vector multiplications
by MAT and not by the transposed matrix MAT^T.
Therefore it can run somewhat better on parallel computers.
 6. QMR simplified [7,9],
ilin(1)=6, ilin(18)<>0
 The Quasi-Minimal Residual (QMR) method behaves similarly
to BICO. In this implementation the look-ahead process
for avoiding breakdowns is omitted.
 7. GMERR (General Minimal ERRor), ilin(1)=7
 GMERR is the generalisation of Fridman's algorithm to
arbitrary matrices. GMERR should be competitive with
GMRES for symmetric matrices.
 9. CGAT (CG with matrix A*A^T), ilin(1)=9
 LSOLPP solves the linear system (MAT)(MAT^T)x = b;
you only have to hand over the matrix MAT to
LSOLPP. The matrix MAT need not be symmetric.
The matrix (MAT)(MAT^T) is then
symmetric; thus LSOLPP chooses the CG method as solution
process.
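The product with (MAT)(MAT^T) needed by CGAT never requires forming the (generally much denser) product matrix explicitly: each application is two matrix-vector multiplications. A small Python sketch of this composition, on a dense toy matrix for clarity:

```python
def cgat_matvec(A, v):
    # Apply (MAT)(MAT^T) to v as two successive matrix-vector
    # products: first w = MAT^T * v, then MAT * w. This keeps the
    # cost at two MVMs per CG step instead of building A*A^T.
    rows, cols = len(A), len(A[0])
    w = [sum(A[i][j] * v[i] for i in range(rows)) for j in range(cols)]  # MAT^T * v
    return [sum(A[i][j] * w[j] for j in range(cols)) for i in range(rows)]  # MAT * w
```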
 10. Polyalgorithm [1]
 The Polyalgorithm tries to exploit the advantages of the
methods PRES20, BICO and ATPRES. It starts with the most
efficient - but least robust - method, PRES20. If PRES20
stagnates, it switches to BICO. If BICO does not
converge fast enough, the Polyalgorithm changes to the
"emergency exit" ATPRES. The Polyalgorithm
should be used if no detailed knowledge of an optimal
iterative method exists. In 95% of all cases the
Polyalgorithm is as fast as the best method in the set {PRES20, BICO, ATPRES}.
If you use the above-mentioned methods, the matrix
must not be stored in the symmetric storage scheme.
Symmetric
The matrix MAT has to be given in the symmetric
storage scheme. This reduces both the amount of workspace and the CPU time
needed. Again you get two different methods, depending on whether you
choose smoothing or no smoothing. For symmetric matrices
both CG methods are faster and more robust than the above-mentioned
methods.
 123a. CG smoothed = GMRES, ilin(1)=123, ilin(18)<>0
 The smoothed Conjugate Gradient (CG) method is
mathematically equivalent to the GMRES method.
 123b. Classical CG, ilin(1)=123, ilin(18)=0
REFERENCES
 [1] W. Schoenauer, Scientific Computing
on Vector Computers, North-Holland, 1987, Amsterdam, New
York, Oxford, Tokyo.
 [2] R. Fletcher, Conjugate Gradient
Methods for Indefinite Systems, Proc. Dundee Biennial
Conf. on Num. Anal., 1975, ed. G.A. Watson, pp. 73-89,
Springer-Verlag.
 [3] H.A. van der Vorst, Bi-CGSTAB: A
Fast and Smoothly Converging Variant of Bi-CG for the
Solution of Nonsymmetric Linear Systems, SIAM J. Sci.
Stat. Comp., 1992, Vol. 13, No. 2, pp. 631-644.
 [4] W. Schoenauer, M. Schlichte, R.
Weiss, Wrong Ways and Promising Ways towards Robust and
Efficient Iterative Linear Solvers, Advances in Computer
Methods for Partial Differential Equations VI, Publ.
IMACS, 1987, pp. 7-14.
 [5] E.J. Craig, The N-Step Iteration
Procedures, Math. Phys., 1955, Vol. 34, pp. 64-73.
 [6] P. Sonneveld, CGS, A Fast Lanczos-type
Solver for Nonsymmetric Linear Systems, SIAM J. Sci.
Stat. Comp., 1989, Vol. 10, No. 1, pp. 36-52.
 [7] R.W. Freund and N.M. Nachtigal, QMR:
a Quasi-Minimal Residual Method for Non-Hermitian Linear
Systems, Numer. Math., 1991, Vol. 60, pp. 315-339.
 [8] R. Weiss, Properties of Generalised
Conjugate Gradient Methods, Num. Lin. Alg. with Appl.,
1994, Vol. 1, No. 1, pp. 45-63.
 [9] L. Zhou and H.F. Walker, Residual
Smoothing Techniques for Iterative Methods, SIAM J. Sci.
Comput., 1994, Vol. 15, No. 2, pp. 297-312.
 [10] R. Weiss, H. Haefner, W.
Schoenauer, LINSOL (LINear
SOLver) - Description and User's Guide for the
Parallelized Version (DRAFT Version 0.96), University
of Karlsruhe, Computing Center, Internal Report 61/95,
1995.
Authors
Program by L. Grosz, H. Haefner, C. Roll, P.
Sternecker, R. Weiss 1989-96. Please contact L. Grosz.
By L. Grosz, Auckland, 4 June 2000.
