PCHEEVR(3) ScaLAPACK routine of NEC Numeric Library Collection PCHEEVR(3) NAME PCHEEVR - computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A distributed in 2D blockcyclic format by calling the recommended sequence of ScaLAPACK routines SYNOPSIS SUBROUTINE PCHEEVR( JOBZ, RANGE, UPLO, N, A, IA, JA, DESCA, VL, VU, IL, IU, M, NZ, W, Z, IZ, JZ, DESCZ, WORK, LWORK, RWORK, LRWORK, IWORK, LIWORK, INFO ) CHARACTER JOBZ, RANGE, UPLO INTEGER IA, IL, INFO, IU, IZ, JA, JZ, LIWORK, LRWORK, LWORK, M, N, NZ REAL VL, VU INTEGER DESCA( * ), DESCZ( * ), IWORK( * ) REAL W( * ), RWORK( * ) COMPLEX A( * ), WORK( * ), Z( * ) PURPOSE PCHEEVR computes selected eigenvalues and, optionally, eigenvectors of a complex Hermitian matrix A distributed in 2D blockcyclic format by calling the recommended sequence of ScaLAPACK routines. First, the matrix A is reduced to real symmetric tridiagonal form. Then, the eigenproblem is solved using the parallel MRRR algorithm. Last, if eigenvectors have been computed, a backtransformation is done. Upon successful completion, each processor stores a copy of all com- puted eigenvalues in W. The eigenvector matrix Z is stored in 2D block- cyclic format distributed over all processors. Notes ===== Each global data object is described by an associated description vec- tor. This vector stores the information required to establish the map- ping between an object element and its corresponding process and memory location. Let A be a generic term for any 2D block cyclicly distributed array. Such a global array has an associated description vector DESCA. In the following comments, the character _ should be read as "of the global array". NOTATION STORED IN EXPLANATION --------------- -------------- -------------------------------------- DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, DTYPE_A = 1. CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating the BLACS process grid A is distribu- ted over. The context itself is glo- bal, but the handle (the integer value) may vary. M_A (global) DESCA( M_ ) The number of rows in the global array A. N_A (global) DESCA( N_ ) The number of columns in the global array A. MB_A (global) DESCA( MB_ ) The blocking factor used to distribute the rows of the array. NB_A (global) DESCA( NB_ ) The blocking factor used to distribute the columns of the array. RSRC_A (global) DESCA( RSRC_ ) The process row over which the first row of the array A is distributed. CSRC_A (global) DESCA( CSRC_ ) The process column over which the first column of the array A is distributed. LLD_A (local) DESCA( LLD_ ) The leading dimension of the local array. LLD_A >= MAX(1,LOCr(M_A)). Let K be the number of rows or columns of a distributed matrix, and assume that its process grid has dimension p x q. LOCr( K ) denotes the number of elements of K that a process would receive if K were distributed over the p processes of its process col- umn. Similarly, LOCc( K ) denotes the number of elements of K that a process would receive if K were distributed over the q processes of its process row. The values of LOCr() and LOCc() may be determined via a call to the ScaLAPACK tool function, NUMROC: LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). An upper bound for these quantities may be computed by: LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A PCHEEVR assumes IEEE 754 standard compliant arithmetic. ARGUMENTS JOBZ (global input) CHARACTER*1 Specifies whether or not to compute the eigenvectors: = 'N': Compute eigenvalues only. = 'V': Compute eigenvalues and eigenvectors. RANGE (global input) CHARACTER*1 = 'A': all eigenvalues will be found. = 'V': all eigenvalues in the interval [VL,VU] will be found. = 'I': the IL-th through IU-th eigenvalues will be found. UPLO (global input) CHARACTER*1 Specifies whether the upper or lower triangular part of the symmetric matrix A is stored: = 'U': Upper triangular = 'L': Lower triangular N (global input) INTEGER The number of rows and columns of the matrix A. N >= 0 A (local input/workspace) 2D block cyclic COMPLEX array, global dimension (N, N), local dimension ( LLD_A, LOCc(JA+N-1) ) (see Notes below for more detailed explanation of 2d arrays) On entry, the symmetric matrix A. If UPLO = 'U', only the upper triangular part of A is used to define the elements of the symmetric matrix. If UPLO = 'L', only the lower triangular part of A is used to define the elements of the symmetric matrix. On exit, the lower triangle (if UPLO='L') or the upper triangle (if UPLO='U') of A, including the diagonal, is destroyed. IA (global input) INTEGER A's global row index, which points to the beginning of the sub- matrix which is to be operated on. It should be set to 1 when operating on a full matrix. JA (global input) INTEGER A's global column index, which points to the beginning of the submatrix which is to be operated on. It should be set to 1 when operating on a full matrix. DESCA (global and local input) INTEGER array of dimension DLEN_. (The ScaLAPACK descriptor length is DLEN_ = 9.) The array descriptor for the distributed matrix A. The descriptor stores details about the 2D block-cyclic stor- age, see the notes below. If DESCA is incorrect, PCHEEVR cannot work correctly. Also note the array alignment requirements specified below VL (global input) REAL If RANGE='V', the lower bound of the interval to be searched for eigenvalues. Not referenced if RANGE = 'A' or 'I'. VU (global input) REAL If RANGE='V', the upper bound of the interval to be searched for eigenvalues. Not referenced if RANGE = 'A' or 'I'. IL (global input) INTEGER If RANGE='I', the index (from smallest to largest) of the smallest eigenvalue to be returned. IL >= 1. Not referenced if RANGE = 'A'. IU (global input) INTEGER If RANGE='I', the index (from smallest to largest) of the largest eigenvalue to be returned. min(IL,N) <= IU <= N. Not referenced if RANGE = 'A'. M (global output) INTEGER Total number of eigenvalues found. 0 <= M <= N. NZ (global output) INTEGER Total number of eigenvectors computed. 0 <= NZ <= M. The number of columns of Z that are filled. If JOBZ .NE. 'V', NZ is not referenced. If JOBZ .EQ. 'V', NZ = M W (global output) REAL array, dimension (N) On normal exit, the first M entries contain the selected eigen- values in ascending order. Z (local output) COMPLEX array, global dimension (N, N), local dimension ( LLD_Z, LOCc(JZ+N-1) ) If JOBZ = 'V', then on normal exit the first M columns of Z contain the orthonormal eigenvectors of the matrix correspond- ing to the selected eigenvalues. If JOBZ = 'N', then Z is not referenced. IZ (global input) INTEGER Z's global row index, which points to the beginning of the sub- matrix which is to be operated on. It should be set to 1 when operating on a full matrix. JZ (global input) INTEGER Z's global column index, which points to the beginning of the submatrix which is to be operated on. It should be set to 1 when operating on a full matrix. DESCZ (global and local input) INTEGER array of dimension DLEN_. The array descriptor for the distributed matrix Z. DESCZ( CTXT_ ) must equal DESCA( CTXT_ ) WORK (local workspace/output) COMPLEX array, dimension (LWORK) WORK(1) returns workspace adequate workspace to allow optimal performance. LWORK (local input) INTEGER Size of WORK array, must be at least 3. If only eigenvalues are requested: LWORK >= N + MAX( NB * ( NP00 + 1 ), NB * 3 ) If eigenvectors are requested: LWORK >= N + ( NP00 + MQ00 + NB ) * NB For definitions of NP00 & MQ00, see LRWORK. For optimal performance, greater workspace is needed, i.e. LWORK >= MAX( LWORK, NHETRD_LWORK ) Where LWORK is as defined above, and NHETRD_LWORK = N + 2*( ANB+1 )*( 4*NPS+2 ) + ( NPS + 1 ) * NPS ICTXT = DESCA( CTXT_ ) ANB = PJLAENV( ICTXT, 3, 'PCHETTRD', 'L', 0, 0, 0, 0 ) SQNPC = SQRT( DBLE( NPROW * NPCOL ) ) NPS = MAX( NUMROC( N, 1, 0, 0, SQNPC ), 2*ANB ) If LWORK = -1, then LWORK is global input and a workspace query is assumed; the routine only calculates the optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by PXERBLA. NOTE THAT FOR OPTIMAL PERFORMANCE, LWOPT IS RETURNED (THE OPTI- MUM WORKSPACE) RATHER THAN THE MINIMUM NECESSARY WORKSPACE LWMIN WHEN A WORKSPACE QUERY IS ISSUED. FOR VERY SMALL MATRICES, LWOPT >> LWMIN. RWORK (local workspace/output) REAL array, dimension (LRWORK) On return, RWORK(1) contains the optimal amount of workspace required for efficient execution. if JOBZ='N' RWORK(1) = optimal amount of workspace required to compute the eigenvalues. if JOBZ='V' RWORK(1) = optimal amount of workspace required to compute eigenvalues and eigenvectors. LRWORK (local input) INTEGER Size of RWORK, must be at least 3. See below for definitions of variables used to define LRWORK. If no eigenvectors are requested (JOBZ = 'N') then LRWORK >= 2 + 5 * N + MAX( 12 * N, NB * ( NP00 + 1 ) ) If eigenvectors are requested (JOBZ = 'V' ) then the amount of workspace required is: LRWORK >= 2 + 5 * N + MAX( 18*N, NP00 * MQ00 + 2 * NB * NB ) + (2 + ICEIL( NEIG, NPROW*NPCOL))*N Variable definitions: NEIG = number of eigenvectors requested NB = DESCA( MB_ ) = DESCA( NB_ ) = DESCZ( MB_ ) = DESCZ( NB_ ) NN = MAX( N, NB, 2 ) DESCA( RSRC_ ) = DESCA( NB_ ) = DESCZ( RSRC_ ) = DESCZ( CSRC_ ) = 0 NP00 = NUMROC( NN, NB, 0, 0, NPROW ) MQ00 = NUMROC( MAX( NEIG, NB, 2 ), NB, 0, 0, NPCOL ) ICEIL( X, Y ) is a ScaLAPACK function returning ceiling(X/Y) If LRWORK = -1, then LRWORK is global input and a workspace query is assumed; the routine only calculates the size required for optimal performance for all work arrays. Each of these val- ues is returned in the first entry of the corresponding work arrays, and no error message is issued by PXERBLA. IWORK (local workspace) INTEGER array On return, IWORK(1) contains the amount of integer workspace required. LIWORK (local input) INTEGER size of IWORK Let NNP = MAX( N, NPROW*NPCOL + 1, 4 ). Then: LIWORK >= 12*NNP + 2*N when the eigenvectors are desired LIWORK >= 10*NNP + 2*N when only the eigenvalues have to be computed If LIWORK = -1, then LIWORK is global input and a workspace query is assumed; the routine only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by PXERBLA. INFO (global output) INTEGER = 0: successful exit < 0: If the i-th argument is an array and the j-entry had an illegal value, then INFO = -(i*100+j), if the i-th argument is a scalar and had an illegal value, then INFO = -i. ALIGNMENT REQUIREMENTS The distributed submatrices A(IA:*, JA:*) and Z(IZ:IZ+M-1,JZ:JZ+N-1) must satisfy the following alignment properties: 1.Identical (quadratic) dimension: DESCA(M_) = DESCZ(M_) = DESCA(N_) = DESCZ(N_) 2.Quadratic conformal blocking: DESCA(MB_) = DESCA(NB_) = DESCZ(MB_) = DESCZ(NB_) DESCA(RSRC_) = DESCZ(RSRC_) 3.MOD( IA-1, MB_A ) = MOD( IZ-1, MB_Z ) = 0 4.IAROW = IZROW ScaLAPACK routine 31 October 2017 PCHEEVR(3)