PDSYEVR(3) ScaLAPACK routine of NEC Numeric Library Collection PDSYEVR(3) NAME PDSYEVR - computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix A distributed in 2D blockcyclic format by calling the recommended sequence of ScaLAPACK routines SYNOPSIS SUBROUTINE PDSYEVR( JOBZ, RANGE, UPLO, N, A, IA, JA, DESCA, VL, VU, IL, IU, M, NZ, W, Z, IZ, JZ, DESCZ, WORK, LWORK, IWORK, LIWORK,INFO ) CHARACTER JOBZ, RANGE, UPLO INTEGER IA, IL, INFO, IU, IZ, JA, JZ, LIWORK, LWORK, M, N, NZ DOUBLE PRECISION VL, VU INTEGER DESCA( * ), DESCZ( * ), IWORK( * ) DOUBLE PRECISION A( * ), W( * ), WORK( * ), Z( * ) PURPOSE PDSYEVR computes selected eigenvalues and, optionally, eigenvectors of a real symmetric matrix A distributed in 2D blockcyclic format by call- ing the recommended sequence of ScaLAPACK routines. First, the matrix A is reduced to real symmetric tridiagonal form. Then, the eigenproblem is solved using the parallel MRRR algorithm. Last, if eigenvectors have been computed, a backtransformation is done. Upon successful completion, each processor stores a copy of all com- puted eigenvalues in W. The eigenvector matrix Z is stored in 2D block- cyclic format distributed over all processors. Note that subsets of eigenvalues/vectors can be selected by specifying a range of values or a range of indices for the desired eigenvalues. Notes ===== Each global data object is described by an associated description vec- tor. This vector stores the information required to establish the map- ping between an object element and its corresponding process and memory location. Let A be a generic term for any 2D block cyclicly distributed array. Such a global array has an associated description vector DESCA, or DESCZ for the descriptor of Z, etc. The length of a ScaLAPACK descrip- tor is nine. In the following comments, the character _ should be read as "of the global array". NOTATION STORED IN EXPLANATION --------------- -------------- -------------------------------------- DTYPE_A(global) DESCA( DTYPE_ )The descriptor type. In this case, DTYPE_A = 1. CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating the BLACS process grid A is distribu- ted over. The context itself is glo- bal, but the handle (the integer value) may vary. M_A (global) DESCA( M_ ) The number of rows in the global array A. N_A (global) DESCA( N_ ) The number of columns in the global array A. MB_A (global) DESCA( MB_ ) The blocking factor used to distribute the rows of the array. NB_A (global) DESCA( NB_ ) The blocking factor used to distribute the columns of the array. RSRC_A (global) DESCA( RSRC_ ) The process row over which the first row of the array A is distributed. CSRC_A (global) DESCA( CSRC_ ) The process column over which the first column of the array A is distributed. LLD_A (local) DESCA( LLD_ ) The leading dimension of the local array. LLD_A >= MAX(1,LOCr(M_A)). Let K be the number of rows or columns of a distributed matrix, and assume that its process grid has dimension p x q. LOCr( K ) denotes the number of elements of K that a process would receive if K were distributed over the p processes of its process col- umn. Similarly, LOCc( K ) denotes the number of elements of K that a process would receive if K were distributed over the q processes of its process row. The values of LOCr() and LOCc() may be determined via a call to the ScaLAPACK tool function, NUMROC: LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ), LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ). An upper bound for these quantities may be computed by: LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A PDSYEVR assumes IEEE 754 standard compliant arithmetic. ARGUMENTS JOBZ (global input) CHARACTER*1 Specifies whether or not to compute the eigenvectors: = 'N': Compute eigenvalues only. = 'V': Compute eigenvalues and eigenvectors. RANGE (global input) CHARACTER*1 = 'A': all eigenvalues will be found. = 'V': all eigenvalues in the interval [VL,VU] will be found. = 'I': the IL-th through IU-th eigenvalues will be found. UPLO (global input) CHARACTER*1 Specifies whether the upper or lower triangular part of the symmetric matrix A is stored: = 'U': Upper triangular = 'L': Lower triangular N (global input) INTEGER The number of rows and columns of the matrix A. N >= 0 A (local input/workspace) 2D block cyclic DOUBLE PRECISION array, global dimension (N, N), local dimension ( LLD_A, LOCc(JA+N-1) ), (see Notes below for more detailed explanation of 2d arrays) On entry, the symmetric matrix A. If UPLO = 'U', only the upper triangular part of A is used to define the elements of the symmetric matrix. If UPLO = 'L', only the lower triangular part of A is used to define the elements of the symmetric matrix. On exit, the lower triangle (if UPLO='L') or the upper triangle (if UPLO='U') of A, including the diagonal, is destroyed. IA (global input) INTEGER A's global row index, which points to the beginning of the sub- matrix which is to be operated on. It should be set to 1 when operating on a full matrix. JA (global input) INTEGER A's global column index, which points to the beginning of the submatrix which is to be operated on. It should be set to 1 when operating on a full matrix. DESCA (global and local input) INTEGER array of dimension DLEN=9. The array descriptor for the distributed matrix A. The descriptor stores details about the 2D block-cyclic stor- age, see the notes below. If DESCA is incorrect, PDSYEVR cannot guarantee correct error reporting. Also note the array alignment requirements specified below. VL (global input) DOUBLE PRECISION If RANGE='V', the lower bound of the interval to be searched for eigenvalues. Not referenced if RANGE = 'A' or 'I'. VU (global input) DOUBLE PRECISION If RANGE='V', the upper bound of the interval to be searched for eigenvalues. Not referenced if RANGE = 'A' or 'I'. IL (global input) INTEGER If RANGE='I', the index (from smallest to largest) of the smallest eigenvalue to be returned. IL >= 1. Not referenced if RANGE = 'A'. IU (global input) INTEGER If RANGE='I', the index (from smallest to largest) of the largest eigenvalue to be returned. min(IL,N) <= IU <= N. Not referenced if RANGE = 'A'. M (global output) INTEGER Total number of eigenvalues found. 0 <= M <= N. NZ (global output) INTEGER Total number of eigenvectors computed. 0 <= NZ <= M. The number of columns of Z that are filled. If JOBZ .NE. 'V', NZ is not referenced. If JOBZ .EQ. 'V', NZ = M W (global output) DOUBLE PRECISION array, dimension (N) Upon successful exit, the first M entries contain the selected eigenvalues in ascending order. Z (local output) DOUBLE PRECISION array, global dimension (N, N), local dimension ( LLD_Z, LOCc(JZ+N-1) ) (see Notes below for more detailed explanation of 2d arrays) If JOBZ = 'V', then on normal exit the first M columns of Z contain the orthonormal eigenvectors of the matrix correspond- ing to the selected eigenvalues. If JOBZ = 'N', then Z is not referenced. IZ (global input) INTEGER Z's global row index, which points to the beginning of the sub- matrix which is to be operated on. It should be set to 1 when operating on a full matrix. JZ (global input) INTEGER Z's global column index, which points to the beginning of the submatrix which is to be operated on. It should be set to 1 when operating on a full matrix. DESCZ (global and local input) INTEGER array of dimension DLEN_. The array descriptor for the distributed matrix Z. The context DESCZ( CTXT_ ) must equal DESCA( CTXT_ ). Also note the array alignment requirements specified below. WORK (local workspace/output) DOUBLE PRECISION array, dimension (LWORK) On return, WORK(1) contains the optimal amount of workspace required for efficient execution. if JOBZ='N' WORK(1) = optimal amount of workspace required to compute the eigenvalues. if JOBZ='V' WORK(1) = optimal amount of workspace required to compute eigenvalues and eigenvectors. LWORK (local input) INTEGER Size of WORK, must be at least 3. See below for definitions of variables used to define LWORK. If no eigenvectors are requested (JOBZ = 'N') then LWORK >= 2 + 5*N + MAX( 12 * NN, NB * ( NP0 + 1 ) ) If eigenvectors are requested (JOBZ = 'V' ) then the amount of workspace required is: LWORK >= 2 + 5*N + MAX( 18*NN, NP0 * MQ0 + 2 * NB * NB ) + (2 + ICEIL( NEIG, NPROW*NPCOL))*NN Variable definitions: NEIG = number of eigenvectors requested NB = DESCA( MB_ ) = DESCA( NB_ ) = DESCZ( MB_ ) = DESCZ( NB_ ) NN = MAX( N, NB, 2 ) DESCA( RSRC_ ) = DESCA( NB_ ) = DESCZ( RSRC_ ) = DESCZ( CSRC_ ) = 0 NP0 = NUMROC( NN, NB, 0, 0, NPROW ) MQ0 = NUMROC( MAX( NEIG, NB, 2 ), NB, 0, 0, NPCOL ) ICEIL( X, Y ) is a ScaLAPACK function returning ceiling(X/Y) If LWORK = -1, then LWORK is global input and a workspace query is assumed; the routine only calculates the size required for optimal performance for all work arrays. Each of these values is returned in the first entry of the corresponding work arrays, and no error message is issued by PXERBLA. Note that in a workspace query, for performance the optimal workspace LWOPT is returned rather than the minimum necessary WORKSPACE LWMIN. For very small matrices, LWOPT >> LWMIN. IWORK (local workspace) INTEGER array On return, IWORK(1) contains the amount of integer workspace required. LIWORK (local input) INTEGER size of IWORK Let NNP = MAX( N, NPROW*NPCOL + 1, 4 ). Then: LIWORK >= 12*NNP + 2*N when the eigenvectors are desired LIWORK >= 10*NNP + 2*N when only the eigenvalues have to be computed If LIWORK = -1, then LIWORK is global input and a workspace query is assumed; the routine only calculates the minimum and optimal size for all work arrays. Each of these values is returned in the first entry of the corresponding work array, and no error message is issued by PXERBLA. INFO (global output) INTEGER = 0: successful exit < 0: If the i-th argument is an array and the j-entry had an illegal value, then INFO = -(i*100+j), if the i-th argument is a scalar and had an illegal value, then INFO = -i. ALIGNMENT REQUIREMENTS The distributed submatrices A(IA:*, JA:*) and Z(IZ:IZ+M-1,JZ:JZ+N-1) must satisfy the following alignment properties: 1.Identical (quadratic) dimension: DESCA(M_) = DESCZ(M_) = DESCA(N_) = DESCZ(N_) 2.Quadratic conformal blocking: DESCA(MB_) = DESCA(NB_) = DESCZ(MB_) = DESCZ(NB_) DESCA(RSRC_) = DESCZ(RSRC_) 3.MOD( IA-1, MB_A ) = MOD( IZ-1, MB_Z ) = 0 4.IAROW = IZROW ScaLAPACK routine 31 October 2017 PDSYEVR(3)