PCHEEVR(3) ScaLAPACK routine of NEC Numeric Library Collection PCHEEVR(3)
NAME
PCHEEVR - computes selected eigenvalues and, optionally, eigenvectors
of a complex Hermitian matrix A distributed in 2D blockcyclic format by
calling the recommended sequence of ScaLAPACK routines
SYNOPSIS
SUBROUTINE PCHEEVR( JOBZ, RANGE, UPLO, N, A, IA, JA, DESCA, VL, VU, IL,
IU, M, NZ, W, Z, IZ, JZ, DESCZ, WORK, LWORK, RWORK,
LRWORK, IWORK, LIWORK, INFO )
CHARACTER JOBZ, RANGE, UPLO
INTEGER IA, IL, INFO, IU, IZ, JA, JZ, LIWORK, LRWORK,
LWORK, M, N, NZ
REAL VL, VU
INTEGER DESCA( * ), DESCZ( * ), IWORK( * )
REAL W( * ), RWORK( * )
COMPLEX A( * ), WORK( * ), Z( * )
PURPOSE
PCHEEVR computes selected eigenvalues and, optionally, eigenvectors of
a complex Hermitian matrix A distributed in 2D blockcyclic format by
calling the recommended sequence of ScaLAPACK routines.
First, the matrix A is reduced to real symmetric tridiagonal form.
Then, the eigenproblem is solved using the parallel MRRR algorithm.
Last, if eigenvectors have been computed, a backtransformation is done.
Upon successful completion, each processor stores a copy of all com-
puted eigenvalues in W. The eigenvector matrix Z is stored in 2D block-
cyclic format distributed over all processors.
Notes
=====
Each global data object is described by an associated description vec-
tor. This vector stores the information required to establish the map-
ping between an object element and its corresponding process and memory
location.
Let A be a generic term for any 2D block cyclicly distributed array.
Such a global array has an associated description vector DESCA.
In the following comments, the character _ should be read as "of the
global array".
NOTATION STORED IN EXPLANATION
--------------- -------------- --------------------------------------
DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case,
DTYPE_A = 1.
CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
the BLACS process grid A is distribu-
ted over. The context itself is glo-
bal, but the handle (the integer
value) may vary.
M_A (global) DESCA( M_ ) The number of rows in the global
array A.
N_A (global) DESCA( N_ ) The number of columns in the global
array A.
MB_A (global) DESCA( MB_ ) The blocking factor used to distribute
the rows of the array.
NB_A (global) DESCA( NB_ ) The blocking factor used to distribute
the columns of the array.
RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
row of the array A is distributed.
CSRC_A (global) DESCA( CSRC_ ) The process column over which the
first column of the array A is
distributed.
LLD_A (local) DESCA( LLD_ ) The leading dimension of the local
array. LLD_A >= MAX(1,LOCr(M_A)).
Let K be the number of rows or columns of a distributed matrix, and
assume that its process grid has dimension p x q.
LOCr( K ) denotes the number of elements of K that a process would
receive if K were distributed over the p processes of its process col-
umn.
Similarly, LOCc( K ) denotes the number of elements of K that a process
would receive if K were distributed over the q processes of its process
row.
The values of LOCr() and LOCc() may be determined via a call to the
ScaLAPACK tool function, NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). An upper
bound for these quantities may be computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
PCHEEVR assumes IEEE 754 standard compliant arithmetic.
ARGUMENTS
JOBZ (global input) CHARACTER*1
Specifies whether or not to compute the eigenvectors:
= 'N': Compute eigenvalues only.
= 'V': Compute eigenvalues and eigenvectors.
RANGE (global input) CHARACTER*1
= 'A': all eigenvalues will be found.
= 'V': all eigenvalues in the interval [VL,VU] will be found.
= 'I': the IL-th through IU-th eigenvalues will be found.
UPLO (global input) CHARACTER*1
Specifies whether the upper or lower triangular part of the
symmetric matrix A is stored:
= 'U': Upper triangular
= 'L': Lower triangular
N (global input) INTEGER
The number of rows and columns of the matrix A. N >= 0
A (local input/workspace) 2D block cyclic COMPLEX array,
global dimension (N, N), local dimension ( LLD_A, LOCc(JA+N-1)
) (see Notes below for more detailed explanation of 2d arrays)
On entry, the symmetric matrix A. If UPLO = 'U', only the
upper triangular part of A is used to define the elements of
the symmetric matrix. If UPLO = 'L', only the lower triangular
part of A is used to define the elements of the symmetric
matrix.
On exit, the lower triangle (if UPLO='L') or the upper triangle
(if UPLO='U') of A, including the diagonal, is destroyed.
IA (global input) INTEGER
A's global row index, which points to the beginning of the sub-
matrix which is to be operated on.
It should be set to 1 when operating on a full matrix.
JA (global input) INTEGER
A's global column index, which points to the beginning of the
submatrix which is to be operated on.
It should be set to 1 when operating on a full matrix.
DESCA (global and local input) INTEGER array of dimension DLEN_.
(The ScaLAPACK descriptor length is DLEN_ = 9.)
The array descriptor for the distributed matrix A.
The descriptor stores details about the 2D block-cyclic stor-
age, see the notes below.
If DESCA is incorrect, PCHEEVR cannot work correctly.
Also note the array alignment requirements specified below
VL (global input) REAL
If RANGE='V', the lower bound of the interval to be searched
for eigenvalues. Not referenced if RANGE = 'A' or 'I'.
VU (global input) REAL
If RANGE='V', the upper bound of the interval to be searched
for eigenvalues. Not referenced if RANGE = 'A' or 'I'.
IL (global input) INTEGER
If RANGE='I', the index (from smallest to largest) of the
smallest eigenvalue to be returned. IL >= 1.
Not referenced if RANGE = 'A'.
IU (global input) INTEGER
If RANGE='I', the index (from smallest to largest) of the
largest eigenvalue to be returned. min(IL,N) <= IU <= N.
Not referenced if RANGE = 'A'.
M (global output) INTEGER
Total number of eigenvalues found. 0 <= M <= N.
NZ (global output) INTEGER
Total number of eigenvectors computed. 0 <= NZ <= M.
The number of columns of Z that are filled.
If JOBZ .NE. 'V', NZ is not referenced.
If JOBZ .EQ. 'V', NZ = M
W (global output) REAL array, dimension (N)
On normal exit, the first M entries contain the selected eigen-
values in ascending order.
Z (local output) COMPLEX array,
global dimension (N, N), local dimension ( LLD_Z, LOCc(JZ+N-1)
) If JOBZ = 'V', then on normal exit the first M columns of Z
contain the orthonormal eigenvectors of the matrix correspond-
ing to the selected eigenvalues.
If JOBZ = 'N', then Z is not referenced.
IZ (global input) INTEGER
Z's global row index, which points to the beginning of the sub-
matrix which is to be operated on.
It should be set to 1 when operating on a full matrix.
JZ (global input) INTEGER
Z's global column index, which points to the beginning of the
submatrix which is to be operated on.
It should be set to 1 when operating on a full matrix.
DESCZ (global and local input) INTEGER array of dimension DLEN_.
The array descriptor for the distributed matrix Z.
DESCZ( CTXT_ ) must equal DESCA( CTXT_ )
WORK (local workspace/output) COMPLEX array,
dimension (LWORK) WORK(1) returns workspace adequate workspace
to allow optimal performance.
LWORK (local input) INTEGER
Size of WORK array, must be at least 3.
If only eigenvalues are requested:
LWORK >= N + MAX( NB * ( NP00 + 1 ), NB * 3 )
If eigenvectors are requested:
LWORK >= N + ( NP00 + MQ00 + NB ) * NB
For definitions of NP00 & MQ00, see LRWORK.
For optimal performance, greater workspace is needed, i.e.
LWORK >= MAX( LWORK, NHETRD_LWORK )
Where LWORK is as defined above, and
NHETRD_LWORK = N + 2*( ANB+1 )*( 4*NPS+2 ) +
( NPS + 1 ) * NPS
ICTXT = DESCA( CTXT_ )
ANB = PJLAENV( ICTXT, 3, 'PCHETTRD', 'L', 0, 0, 0, 0 )
SQNPC = SQRT( DBLE( NPROW * NPCOL ) )
NPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB )
If LWORK = -1, then LWORK is global input and a workspace query
is assumed; the routine only calculates the optimal size for
all work arrays. Each of these values is returned in the first
entry of the corresponding work array, and no error message is
issued by PXERBLA.
NOTE THAT FOR OPTIMAL PERFORMANCE, LWOPT IS RETURNED (THE OPTI-
MUM WORKSPACE) RATHER THAN THE MINIMUM NECESSARY WORKSPACE
LWMIN WHEN A WORKSPACE QUERY IS ISSUED.
FOR VERY SMALL MATRICES, LWOPT >> LWMIN.
RWORK (local workspace/output) REAL array,
dimension (LRWORK)
On return, RWORK(1) contains the optimal amount of workspace
required for efficient execution.
if JOBZ='N' RWORK(1) = optimal amount of workspace
required to compute the eigenvalues.
if JOBZ='V' RWORK(1) = optimal amount of workspace
required to compute eigenvalues and eigenvectors.
LRWORK (local input) INTEGER
Size of RWORK, must be at least 3.
See below for definitions of variables used to define LRWORK.
If no eigenvectors are requested (JOBZ = 'N') then
LRWORK >= 2 + 5 * N + MAX( 12 * N, NB * ( NP00 + 1 ) )
If eigenvectors are requested (JOBZ = 'V' ) then
the amount of workspace required is:
LRWORK >= 2 + 5 * N + MAX( 18*N, NP00 * MQ00 + 2 * NB * NB )
+ (2 + ICEIL( NEIG, NPROW*NPCOL))*N
Variable definitions:
NEIG = number of eigenvectors requested
NB = DESCA( MB_ ) = DESCA( NB_ ) =
DESCZ( MB_ ) = DESCZ( NB_ )
NN = MAX( N, NB, 2 )
DESCA( RSRC_ ) = DESCA( NB_ ) = DESCZ( RSRC_ ) =
DESCZ( CSRC_ ) = 0
NP00 = NUMROC( NN, NB, 0, 0, NPROW )
MQ00 = NUMROC( MAX( NEIG, NB, 2 ), NB, 0, 0, NPCOL )
ICEIL( X, Y ) is a ScaLAPACK function returning
ceiling(X/Y)
If LRWORK = -1, then LRWORK is global input and a workspace
query is assumed; the routine only calculates the size required
for optimal performance for all work arrays. Each of these val-
ues is returned in the first entry of the corresponding work
arrays, and no error message is issued by PXERBLA.
IWORK (local workspace) INTEGER array
On return, IWORK(1) contains the amount of integer workspace
required.
LIWORK (local input) INTEGER
size of IWORK
Let NNP = MAX( N, NPROW*NPCOL + 1, 4 ). Then:
LIWORK >= 12*NNP + 2*N when the eigenvectors are desired
LIWORK >= 10*NNP + 2*N when only the eigenvalues have to be
computed
If LIWORK = -1, then LIWORK is global input and a workspace
query is assumed; the routine only calculates the minimum and
optimal size for all work arrays. Each of these values is
returned in the first entry of the corresponding work array,
and no error message is issued by PXERBLA.
INFO (global output) INTEGER
= 0: successful exit
< 0: If the i-th argument is an array and the j-entry had
an illegal value, then INFO = -(i*100+j), if the i-th
argument is a scalar and had an illegal value, then
INFO = -i.
ALIGNMENT REQUIREMENTS
The distributed submatrices A(IA:*, JA:*) and Z(IZ:IZ+M-1,JZ:JZ+N-1)
must satisfy the following alignment properties:
1.Identical (quadratic) dimension:
DESCA(M_) = DESCZ(M_) = DESCA(N_) = DESCZ(N_)
2.Quadratic conformal blocking:
DESCA(MB_) = DESCA(NB_) = DESCZ(MB_) = DESCZ(NB_)
DESCA(RSRC_) = DESCZ(RSRC_)
3.MOD( IA-1, MB_A ) = MOD( IZ-1, MB_Z ) = 0
4.IAROW = IZROW
ScaLAPACK routine 31 October 2017 PCHEEVR(3)