PZLASMSUB(3) ScaLAPACK routine of NEC Numeric Library Collection PZLASMSUB(3)
NAME
PZLASMSUB - look for a small subdiagonal element from the bottom of the
matrix that it can safely set to zero
SYNOPSIS
SUBROUTINE PZLASMSUB( A, DESCA, I, L, K, SMLNUM, BUF, LWORK )
INTEGER I, K, L, LWORK
DOUBLE PRECISION SMLNUM
INTEGER DESCA( * )
COMPLEX*16 A( * ), BUF( * )
PURPOSE
PZLASMSUB looks for a small subdiagonal element from the bottom of the
matrix that it can safely set to zero.
Notes
=====
Each global data object is described by an associated description vec-
tor. This vector stores the information required to establish the map-
ping between an object element and its corresponding process and memory
location.
Let A be a generic term for any 2D block cyclicly distributed array.
Such a global array has an associated description vector DESCA. In the
following comments, the character _ should be read as "of the global
array".
NOTATION STORED IN EXPLANATION
--------------- -------------- --------------------------------------
DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
DTYPE_A = 1.
CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
the BLACS process grid A is distribu-
ted over. The context itself is glo-
bal, but the handle (the integer
value) may vary.
M_A (global) DESCA( M_ ) The number of rows in the global
array A.
N_A (global) DESCA( N_ ) The number of columns in the global
array A.
MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
the rows of the array.
NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
the columns of the array.
RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
row of the array A is distributed.
CSRC_A (global) DESCA( CSRC_ ) The process column over which the
first column of the array A is
distributed.
LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
array. LLD_A >= MAX(1,LOCr(M_A)).
Let K be the number of rows or columns of a distributed matrix, and
assume that its process grid has dimension p x q.
LOCr( K ) denotes the number of elements of K that a process would
receive if K were distributed over the p processes of its process col-
umn.
Similarly, LOCc( K ) denotes the number of elements of K that a process
would receive if K were distributed over the q processes of its process
row.
The values of LOCr() and LOCc() may be determined via a call to the
ScaLAPACK tool function, NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). An upper
bound for these quantities may be computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
ARGUMENTS
A (global input) COMPLEX*16 array, dimension (DESCA(LLD_),*)
On entry, the Hessenberg matrix whose tridiagonal part is being
scanned. Unchanged on exit.
DESCA (global and local input) INTEGER array of dimension DLEN_.
The array descriptor for the distributed matrix A.
I (global input) INTEGER
The global location of the bottom of the unreduced submatrix of
A. Unchanged on exit.
L (global input) INTEGER
The global location of the top of the unreduced submatrix of A.
Unchanged on exit.
K (global output) INTEGER
On exit, this yields the bottom portion of the unreduced subma-
trix. This will satisfy: L <= M <= I-1.
SMLNUM (global input) DOUBLE PRECISION
On entry, a "small number" for the given matrix. Unchanged on
exit.
BUF (local output) COMPLEX*16 array of size LWORK.
LWORK (global input) INTEGER
On exit, LWORK is the size of the work buffer. This must be at
least 2*Ceil( Ceil( (I-L)/HBL ) / LCM(NPROW,NPCOL) ) Here LCM
is least common multiple, and NPROWxNPCOL is the logical grid
size.
Notes:
======
This routine does a global maximum and must be called by all
processes. This code is basically a parallelization of the
following snip of LAPACK code from ZLAHQR:
Look for a single small subdiagonal element.
DO 20 K = I, L + 1, -1 TST1 = CABS1( H( K-1, K-1 ) ) + CABS1(
H( K, K ) ) IF( TST1.EQ.ZERO ) $ TST1 = ZLANHS( '1', I-
L+1, H( L, L ), LDH, WORK ) IF( CABS1( H( K, K-1 ) ).LE.MAX(
ULP*TST1, SMLNUM ) ) $ GO TO 30 20 CONTINUE 30
CONTINUE
FURTHER DETAILS
Implemented by: M. Fahey, May 28, 1999
ScaLAPACK routine 31 October 2017 PZLASMSUB(3)