GNU-Darwin Web
content="text/html; charset=iso-8859-1"> content="C:\PROGRAM FILES\MICROSOFT OFFICE\OFFICE\html.dot">

# VECFEM3 Reference Manual: LSOLPP

Type: FORTRAN routine

## NAME

LSOLPP - LINSOL: Solves a linear system of equations MAT*x = b

## SYNOPSIS

call LSOLPP(lmat,lprec,liprec,lindex,l,ldw,liw,nproc,
lsym,ia1,info,MAT,prec,x,b,dw,iprec,index,iw,ilin, lmatbk,ptrinf,tids,epslin,iconv,ierr)
integer
lmat,lprec,liprec,lindex,l,ldw,liw,nproc,ia1,iconv, ierr
integer
index(lindex),info(ia1,7),ilin(50),iprec(liprec), iw(liw),lmatbk(nproc),ptrinf(12,nproc), tids(nproc)
double precision
MAT(lmat),prec(lprec),dw(ldw),x(l),b(l),epslin
logical
lsym

## PURPOSE

LSOLPP is a routine to solve a linear system MAT*x = b with a large and sparse l x l real matrix MAT and a right hand side b of length l. Different solution methods of the generalised CG-method are used, which can be selected by the user. To accelerate the iteration and to get a better stability, the matrix MAT is normalised by a diagonal matrix and/or precondition from the left-hand side and/or the right hand side. MAT is handed over in a sparse matrix storage scheme.

## ARGUMENTS

If a parameter has the attribute 'local', the parameter values can vary on different processors. If a parameter has the attribute 'global', it must have the same value on all processors. The label *PAR means that the parameter only has to be declared, but does not have to be set (initialised) on NON-PARALLEL computers. The label **PAR means that the parameter only has to be declared, but does not have to be set, if nproc is set (initialised) to 1 (on NON-PARALLEL computers nproc is internally set to 1).

lmat integer, scalar, input, local
Length of (local) matrix array MAT.
lprec integer, scalar, input, local
Length of preconditioning matrix prec. lprec must be greater or equal 5*l.
lindex integer, scalar, input, local
Length of the integer array index.
liprec integer, scalar, input, local
Length of the integer array iprec. liprec must be greater or equal 1.
l integer, scalar, input, global
Maximal number of unknowns on a processor (it must hold: l = max{lmatbk(i),i=1,..,nproc}).
ldw integer, scalar, input, local
Length of the real work array dw.
 ldw => 22*l for PRES20 and Polyalgorithm ldw => 22*l for BICO, QMR-Simulator and CG ldw => 22*l Otherwise
liw integer, scalar, input, local
Length of the integer work array iw.
 liw => 2*nproc ,if optim<>100 and optim<>200 liw => max(2*nproc,4*l) ,if optim==100 or optim==200
nproc integer, scalar, input, global <<< *PAR
Number of processors (processes), see combgn.
lsym logical, scalar, input, global
is .true., if the matrix MAT is symmetric.
ia1 integer, scalar, input, local
First dimension of matrix information array info, see sparse matrix storage scheme. The parameter must be greater or equal max{ptrinf(12,i),i=1,nproc} on each processor.
info integer, array: info(ia1,ia2), input, local
Information array for the matrix MAT (ia2 {= 7} is a constant), see sparse matrix storage scheme.
MAT double precision, array: MAT(lmat), input/output, local
Matrix array, see sparse matrix storage scheme.
prec double precision, array: prec(lprec), input, local
Preconditioning matrix.
x double precision, array: x(l), input/output, local
Solution vector.
b double precision, array: b(l), input/output, local
Right hand side.
dw double precision, array: dw(ldw), internal, local
Real work array.
iprec integer, array: iprec(liprec), input, local
Information vector of preconditioning matrix.
index integer, array: index(lindex), input, local
Array of indices allowing indirect access to the matrix array MAT, see sparse matrix storage scheme.
iw integer, array: iw(liw), internal, local
Integer work array.
ilin integer, array: ilin(nilin), input/output, global
Information vector (nilin {=50} is a constant).
(1) = ms, input, global
Method selection.
 ms = 1 PRES20 ms = 2 BICO ms = 3 Bi-CGSTAB ms = 4 AT-PRES ms = 5 CGS ms = 6 QMR-Simulator ms = 7 GMERR ms = 9 CG-AT for non-symmetric matrices ms =10 Polyalgorithm PRES20 -> BICO -> AT-PRES ms =123 Classical CG for symmetric matrices.
(2) = itmax, input, global
Maximal number of iteration steps.
(3) = nmsg, input/output, global <<< *PAR
Message counter for the number of communication units.
(4) not used
(5) not used
(6) = isprec, input/output, global
indicates, if the matrix MAT is alread normalised.
 isprec = 0 matrix is not normalised, isprec = 1 matrix is normalised.
(7) not used
(8) = msprec, input, global
selects the method of preconditioning. (The preconditioning matrices are stored as one-dimensional arrays in prec and accessed via information array iprec).
 msprec = 0 : no preconditioning, msprec = 1 : preconditioning by (complete) LU factorisation << in preparation >>
(9) = noprec, input, global
 noprec = 0 normalised system is considered for checking of stopping criterion, noprec = 1 original system is considered for checking of stopping criterion.
(10) = nmvm, output, global
Number of computed matrix-vector multiplications.
(11) not used
(12) = lout, input, local
Unit number of the standard output file (default is 6).
(13) = idoku, input, global
Output control.
 idoku = 0 print only error messages, idoku = 1 print further information, but no output during iteration, idoku = i>1 output always after i iteration steps on the processor with logical process(or) number 1, idoku = -i<0 it holds the same as for idoku>0 with the difference of output on all processors.
(14) not used
(15) = msnorm, input, global
selects the method of normalisation. Note: "A" stands for a usually stored matrix (matrix MAT is stored as a one-dimensional array and accessed via array info).
 msnorm = 0 no normalisation, msnorm = 1 normalisation by row sum (prec(i) = sign A_ii/sum abs(A_ij),j=1,l), msnorm = 2 normalisation by main diagonal elements (prec(i) = 1/A_ii), msnorm = 3 normalisation by square sum (prec(i) = 1/sqrt(sum A_ij**2,j=1,l)), msnorm = 4 Froboenius normalisation (prec(i) = A_ii/sum A_ij**2,j=1,l), msnorm = 11 same as 1-4, but with square root of preconditioning matrix from left and right (automatically selected if the matrix MAT is symmetric). msnorm = 12 msnorm = 13 msnorm = 14
(16) = startx, input, global
 startx <> 4711 iteration starts from zero vector, startx = 4711 an initial guess is given in x.
(17) = myproc, input, local <<< **PAR
Logical process(or) number.
(18) = smooth, input, global
 smooth = 0 iteration returns the non-smoothed solution, smooth <> 0 iteration returns the smoothed solution.
(19) = misc, input, global
 misc = 00000 Standard LSOLPP misc = 00001 Printing and checking - as far as possible - of vector terms(you can set ilin(21)) misc = 00010 Check, if recursion emerge(only of use on vector computers) misc = 00100 Reading of (distributed) matrix MAT in LSOLPP-Format (you must set ilin(22)) misc = 01000 Storing of (distributed) matrix MAT in LSOLPP-Format (you must set ilin(23)) misc = 10000 Output of error
(20) = optim, input, global
 optim = 000 No optimisation optim = 001 Optimisation of data structures optim = 010 Optimisation of data structures with printing of statistics regarding the data structures optim = 100 Safe cache reuse, ie. arrays MAT and INDEX do not change optim = 200 Unsafe cache reuse, ie. arrays MAT and INDEX do change
(21) = vtunit, input, global
I/O unit for printing of vector terms (should be greater 20 and less 100 - if not set, then printing on standard output).
(22) = matin, input, global
I/O unit for reading of matrix MAT (should be greater 20 and less 100 - if not set, then LSOLPP stops).
(23) = matout, input, global
I/O unit for storing of matrix MAT (should be greater 20 and less 100 - if not set, then LSOLPP stops).
lmatbk integer, array: lmatbk(nproc), input, global <<< **PAR
lmatbk(i) returns the number of unknowns on process(or) i, see sparse matrix storage scheme.
ptrinf integer, array: ptrinf(ntyp+1,nproc), input, local
ptrinf(ityp,i)+1 points to the first entry within the array info pointing to vector terms of storage pattern ityp in block i; the array info must be sorted by storage patterns (ntyp {=11} is a constant) and by blocks, see sparse matrix storage scheme.
tids integer, array: tids(nproc), input, global <<< **PAR
tids defines the mapping of the logical process(or) numbers to the physical process ID's, see combgn.
iconv integer, output, global
iconv indicates the behaviour of LSOLPP.
 iconv = 1 convergence, iconv = 2 iteration reached the maximal number of matrix-vector-multiplications (declared in ilin(2)), iconv = 3 divergence, iconv = 4 matrix is not suitable for LSOLPP, e.g. non-regular matrix, iconv = 99 fatal error.
ierr integer, output, local
ierr is a error number with information about the error.
 ierr =ijk i = 0,..,9 indicates the level of routines, on which the error occurs, j = 0 means: wrong parameters, j = 1 means: error during normalisation, j = 2 means: error during iteration process, j = 3 means: error during communication, k is a general error counter with 0
epslin double precision, input, global
epslin is the relative stopping factor. If (lcond1 .and. lcond2) then the iteration is stopped. If (lcond1 .and. .not. lcond2) then a new stopping criterion is computed.

lcond1=(||(precondition) smoothed residuum||2 <= epslin *|| (precondition) r.h.s. ||2)

and

 noprec=1 Lcond2=(||original smoothed residuum||max<=epslin *||original r.h.s. ||max) noprec=0 Lcond2=(||(precondition) smoothed residuum||max<=epslin*||(preconitioned) r.h.s. ||max)

N/A

## METHOD

The iterative methods, smoothing, and normalisation are described in the LINSOL user's guide [10].

### Normalization

At the beginning the matrix MAT is normalised from the left by a diagonal matrix prec(1..l), if MAT is not normalised (ilin(6)=isprec = 0) and a normalisation method is chosen (ilin(15)=msnorm <> 0). If MAT is symmetric or a symmetric normalisation is selected, it is additionally normalised from the right to preserve symmetry. The user can choose different methods of normalisation. We recommend the normalisation by row sum (ilin(15) = 1), because in the most cases it is the best method with regard to the performed matrix-vector multiplications. Returning from LSOLPP the array MAT is normalised (ilin(6)=isprec = 1). This is important to remember, if LSOLPP is recalled or the matrix should be used in other contexts after calling LSOLPP!

### Smoothing

The residual of a generalised CG-method can oscillate strongly. Therefore both the residual and the approximated solution are smoothed in every iteration step to ensure that the Euclidean norm of the residual does not increase. By setting the parameter ilin(18) to zero it is possible to obtain the original solution instead of the smoothed one, but the iteration process is always controlled by the smoothed residual.

### Non-symmetric

For non-symmetric problems, LSOLPP provides seven different solution algorithms. These algorithms are implemented both with smoothing and without smoothing. Thus we get from seven implemented algorithms thirteen methods. In the sequel the most popular and "well-known" ten basic methods, one error-minimising method and the polyalgorithm are described.

1. PRES20 [1] (Pseudo-RESidual 20), ilin(1)=1, ilin(18)<>0
For sufficient diagonal dominance of the matrix this is a quickly converging method. If the norm of the residual decreases only in the first iteration steps, then choose one of the other methods. The residuals decrease monotonically.
2a. BICO [1] (BICOnjugate gradients smoothed), ilin(1)=2, ilin(18)<>0
BICO is more robust than PRES20 and converges quickly even if the matrix is not diagonally dominant, but BICO uses two matrix-vector multiplications (MVM) by both the matrix MAT and its transposed matrix MAT^T per iteration step. The residuals decrease monotonically.
2b. BCG [2] (BiConjugate Gradients), ilin(1)=2, ilin(18)=0
BCG is more robust than PRES20 and converges quickly even if the matrix is not diagonally dominant, but BICO uses two matrix-vector multiplications (MVM) by both the matrix MAT and its transposed matrix MAT^T per iteration step. The residuals oscillate heavily. Therefore, use BICO or QMR.
3a. Bi-CGSTAB smoothed [3,8] , ilin(1)=3, ilin(18)<>0
The Bi-Conjugate Gradient STABilized (Bi-CGSTAB) method is a modification of CGS that should be more stable. The residuals decrease monotonically.
3b. Bi-CGSTAB [3] , ilin(1)=3, ilin(18)=0
This is a modification of CGS that should be more stable.
4a. AT-PRES = CGNR [4] , ilin(1)=4, ilin(18)<>0
The A_Transposed-Pseudo_RESidual (AT-PRES) method is equivalent to the CG Normal equations Residual-minimising (CGNR) method and should only be used if one of the methods BICO, CGS or Bi-CGSTAB does not converge. It may converge slowly but it is very robust. The residuals decrease monotonically.
4b. CGNE (CG Normal eq. Error-min.) [5], ilin(1)=4, ilin(18)=0
CGNE should only be used if one of the methods BICO, CGS or Bi-CGSTAB does not converge. It may converge slowly but it is very robust. The errors decrease monotonically.
5a. CGS (Conjugate Grad. Squared) smoothed, ilin(1)=5, ilin(18)<>0
The smoothed CGS [6,8] is as robust as BICO. An advantage of CGS compared to BICO is that CGS only uses matrix-vector multiplications by MAT and not by the transposed matrix MAT^T. Therefore it could run a little better (-: It's getting better all the time :-> on parallel computers. The residuals decrease monotonically.
5b. CGS (Conjugate Gradient Squared) [6], ilin(1)=5, ilin(18)=0
CGS is as robust as BICO. An advantage of CGS compared to BICO is that CGS only uses matrix-vector multiplications by MAT and not by the transposed matrix MAT^T. Therefore it could run a little better (-: It's getting better all the time :-> on parallel computers.
6. QMR simplified [7,9], ilin(1)=6, ilin(18)<>0
The Quasi-Minimal Residual (QMR) method behaves similar as BICO. In this implementation the look-ahead process for avoiding breakdowns is omitted.
7. GMERR (General Minimal ERRor), ilin(1)=7
GMERR is the generalisation for arbitrary matrices of Fridman's algorithm. GMERR should be competitive with GMRES for symmetric matrices.
9. CG-AT (CG with matrix A*A^T), ilin(1)=9
LSOLPP computes the linear system (MAT)(MAT^T)x=b; you have only to hand over the matrix MAT to LSOLPP. The matrix MAT must not be symmetric. The matrix (MAT)(MAT^T) then is symmetric; thus LSOLPP chooses the CG method as solution process.
10. Polyalgorithm [1]
The Polyalgorithm tries to exploit the advantages of the methods PRES20, BICO and AT-PRES. It starts with the most efficient - but least robust - method PRES20. If PRES20 falls asleep, it switches to BICO. If BICO does not converge fast enough, the Polyalgorithm changes to the "emergency exit" AT-PRES. The Polyalgorithm should be used if no detailed knowledge of an optimal iterative method exists. In 95% of all cases the Polyalgorithm is as fast as the best method in the set {PRES20,BICO,AT-PRES}. .RE -7 If you use the above-mentioned methods, the matrix must not be stored in the symmetric storage scheme.

### Symmetric

The matrix MAT has to be given in the symmetric storage scheme. This reduces both the amount of workspace and CPU-time needed. You again get two different methods, if you choose smoothing and no smoothing respectively. For symmetric matrices both CG methods are faster and more robust than the above mentioned methods.

123a. CG smoothed = GMRES, ilin(1)=123, ilin(18)<>0
The smoothed Conjugate Gradient (CG) method is mathematically equivalent to the GMRES method.
123b. Classical CG, ilin(1)=123, ilin(18)=0

## REFERENCES

[1] W. Schoenauer, Scientific Computing on Vector Computers, North-Holland, 1987, Amsterdam, New York, Oxford, Tokyo.
[2] R. Fletcher, Conjugate Gradient Methods for Indefinite Systems, Proc. Dundee Biennial Conf. on Num. Anal., 1975, ed. G.A. Watson, pp. 73-89, Springer-Verlag.
[3] H.A. van der Vorst, BI-CGSTAB: A Fast and Smoothly Converging Variant of BI-CG for the Solution of Non-symmetric Linear Systems, SIAM J. Sci. Stat. Comp., 1992, Vol. 13, No. 2, pp. 631-644.
[4] W. Schoenauer, M. Schlichte, R. Weiss, Wrong Ways and Promising Ways towards Robust and Efficient Iterative Linear Solvers, Advances in Computer Methods for Partial Differential Equations VI, Publ. IMACS, 1987, pp. 7-14.
[5] E.J. Craig, The N-Step Iteration Procedures, Math. Phys., 1955, Vol. 34, pp. 64--73.
[6] P. Sonneveld, CGS, A fast Lanczos-type Solver for non-symmetric linear Systems, SIAM J. Sci. Stat. Comp., 1989, Vol. 10, No.1, pp. 36-52.
[7] R.W. Freund and N.M. Nachtigal, QMR: a Quasi-Minimal Residual Method for Non-Hermitian Linear Systems, Numer. Math., 1991, Vol. 60, pp. 315--339.,
[8] R. Weiss, Properties of Generalised Conjugate Gradient Methods, Num. Lin. Alg. with Appl., 1994, Vol. 1, No. 1, pp. 45--63.
[9] L. Zhou and H.F. Walker, Residual Smoothing Techniques for Iterative Methods, SIAM J. Sci. Comput., 1994, Vol. 15, No. 2, pp. 297--312.
[10] R. Weiss, H. Haefner, W. Schoenauer: LINSOL (LINear SOLver) - Description and User's Guide for the Parallelized Version (DRAFT Version 0.96); University of Karlsruhe; Computing Center; Internal Report 61/95; 1995.