PZHENGST

PZHENGST(3)   ScaLAPACK routine of NEC Numeric Library Collection  PZHENGST(3)



NAME
       PZHENGST - reduce a complex Hermitian-definite generalized eigenproblem
       to standard form

SYNOPSIS
       SUBROUTINE PZHENGST(
                           IBTYPE, UPLO, N, A,  IA,  JA,  DESCA,  B,  IB,  JB,
                           DESCB, SCALE, WORK, LWORK, INFO )

           CHARACTER       UPLO

           INTEGER         IA, IB, IBTYPE, INFO, JA, JB, LWORK, N

           DOUBLE          PRECISION SCALE

           INTEGER         DESCA( * ), DESCB( * )

           COMPLEX*16      A( * ), B( * ), WORK( * )

PURPOSE
       PZHENGST  reduces a complex Hermitian-definite generalized eigenproblem
       to standard form.

       PZHENGST performs the same function as PZHEGST, but is based on rank 2K
       updates, which are faster and more scalable than triangular solves (the
       basis of PZHEGST).

       PZHENGST calls PZHEGST when UPLO='U', hence PZHENGST provides improved
       performance only when UPLO='L', IBTYPE=1.

       PZHENGST also calls PZHEGST when insufficient workspace is provided,
       hence PZHENGST provides improved performance only when
       LWORK >= 2 * NP0 * NB + NQ0 * NB + NB * NB.

       In  the following sub( A ) denotes A( IA:IA+N-1, JA:JA+N-1 ) and sub( B
       ) denotes B( IB:IB+N-1, JB:JB+N-1 ).

       If IBTYPE = 1, the problem is sub( A )*x = lambda*sub( B )*x, and sub(
       A ) is overwritten by inv(U**H)*sub( A )*inv(U) or inv(L)*sub( A
       )*inv(L**H).

       If IBTYPE = 2 or 3, the problem is sub( A )*sub( B )*x  =  lambda*x  or
       sub( B )*sub( A )*x = lambda*x, and sub( A ) is overwritten by U*sub( A
       )*U**H or L**H*sub( A )*L.

       sub( B ) must have been previously factorized as U**H*U  or  L*L**H  by
       PZPOTRF.
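
       The IBTYPE = 1, UPLO = 'L' case can be illustrated on a serial 2-by-2
       toy problem. The sketch below does not call ScaLAPACK; the matrices
       and helper names (matmul, herm, det, inv_lower) are made up for
       illustration. It forms C = inv(L)*A*inv(L**H) with B = L*L**H and
       checks that each eigenvalue lambda of C makes A - lambda*B singular,
       which is what reduction to standard form is meant to preserve.

```python
import math

def matmul(x, y):
    """Product of two 2-by-2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def herm(x):
    """Conjugate transpose of a 2-by-2 matrix."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(x):
    """Determinant of a 2-by-2 matrix."""
    return x[0][0] * x[1][1] - x[0][1] * x[1][0]

def inv_lower(l):
    """Inverse of a 2-by-2 lower triangular matrix."""
    return [[1 / l[0][0], 0j],
            [-l[1][0] / (l[0][0] * l[1][1]), 1 / l[1][1]]]

# Assumed sample data: A is Hermitian, L is a Cholesky-style lower factor.
a = [[4 + 0j, 1 - 2j], [1 + 2j, 5 + 0j]]
l = [[2 + 0j, 0j], [1 + 1j, 3 + 0j]]
b = matmul(l, herm(l))          # B = L * L**H, Hermitian positive definite

# Standard-form matrix C = inv(L) * A * inv(L**H).
linv = inv_lower(l)
c = matmul(matmul(linv, a), herm(linv))

# Eigenvalues of the 2-by-2 Hermitian matrix C from its trace and determinant.
half_trace = (c[0][0] + c[1][1]).real / 2
radius = math.sqrt(max(half_trace ** 2 - det(c).real, 0.0))
eigenvalues = [half_trace - radius, half_trace + radius]

# Each eigenvalue of C should make A - lambda*B singular (determinant near 0).
residuals = [abs(det([[a[i][j] - lam * b[i][j] for j in range(2)]
                      for i in range(2)]))
             for lam in eigenvalues]
print(residuals)
```

       The residuals are expected to be zero up to rounding error. An
       eigenvector y of C corresponds to x = inv(L**H)*y in the original
       problem.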

       Notes
       =====

       Each  global data object is described by an associated description vec-
       tor.  This vector stores the information required to establish the map-
       ping between an object element and its corresponding process and memory
       location.

       Let A be a generic term for any 2D block cyclically distributed array.
       Such a global array has an associated description vector DESCA.  In the
       following comments, the character _ should be read as  "of  the  global
       array".

       NOTATION        STORED IN      EXPLANATION
       --------------- -------------- --------------------------------------
       DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
                                      DTYPE_A = 1.
       CTXT_A (global) DESCA( CTXT_ ) The BLACS context handle, indicating
                                      the BLACS process grid A is distribu-
                                      ted over. The context itself is glo-
                                      bal, but the handle (the integer
                                      value) may vary.
       M_A    (global) DESCA( M_ )    The number of rows in the global
                                      array A.
       N_A    (global) DESCA( N_ )    The number of columns in the global
                                      array A.
       MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
                                      the rows of the array.
       NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
                                      the columns of the array.
       RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
                                      row  of  the  array  A  is  distributed.
       CSRC_A (global) DESCA( CSRC_ ) The process column over which the
                                      first column of the array A is
                                      distributed.
       LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
                                      array.  LLD_A >= MAX(1,LOCr(M_A)).
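
       As a rough illustration of the table above, the sketch below lays out
       a descriptor as a plain list. The 1-based field positions follow the
       usual ScaLAPACK convention (DTYPE_ = 1 through LLD_ = 9, DLEN_ = 9);
       the helper make_desc and all sample values are hypothetical. In a
       real program the descriptor would normally be filled in by the
       ScaLAPACK tool routine DESCINIT.

```python
# 1-based field positions within a descriptor, per the usual convention.
DTYPE_, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_ = range(1, 10)
DLEN_ = 9

def make_desc(ctxt, m, n, mb, nb, rsrc, csrc, lld):
    """Hypothetical helper: build a dense-matrix descriptor (DTYPE_A = 1).

    The list is 0-based, so field FIELD_ is stored at index FIELD_ - 1.
    """
    return [1, ctxt, m, n, mb, nb, rsrc, csrc, lld]

# Assumed example: a 1000-by-1000 matrix in 64-by-64 blocks, first block
# owned by process (0, 0), with a local leading dimension of 512.
desc_a = make_desc(ctxt=0, m=1000, n=1000, mb=64, nb=64,
                   rsrc=0, csrc=0, lld=512)

print("NB_A  =", desc_a[NB_ - 1])
print("LLD_A =", desc_a[LLD_ - 1])
```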

       Let K be the number of rows or columns of  a  distributed  matrix,  and
       assume that its process grid has dimension p x q.
       LOCr(  K  )  denotes  the  number of elements of K that a process would
       receive if K were distributed over the p processes of its process  col-
       umn.
       Similarly, LOCc( K ) denotes the number of elements of K that a process
       would receive if K were distributed over the q processes of its process
       row.
       The  values  of  LOCr()  and LOCc() may be determined via a call to the
       ScaLAPACK tool function, NUMROC:
               LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
               LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).
       An upper bound for these quantities may be computed by:
               LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
               LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A
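
       To make the NUMROC formulas concrete, here is a serial sketch that
       mirrors the block-cyclic counting logic of NUMROC and the upper bound
       quoted above. It is an illustrative port, not the library routine;
       the function names numroc and upper_bound and the sample sizes are
       assumptions.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of an n-long dimension, distributed in
    blocks of nb over nprocs processes, that land on process iproc when
    the first block belongs to process isrcproc."""
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from the source
    nblocks = n // nb                              # number of whole blocks
    count = (nblocks // nprocs) * nb               # every process gets these
    extrablks = nblocks % nprocs                   # leftover whole blocks
    if mydist < extrablks:
        count += nb                                # one extra whole block
    elif mydist == extrablks:
        count += n % nb                            # the trailing partial block
    return count

def upper_bound(n, nb, nprocs):
    """ceil( ceil(n/nb) / nprocs ) * nb, using integer arithmetic."""
    blocks = -(-n // nb)
    return -(-blocks // nprocs) * nb

# Assumed example: 1000 rows in blocks of 64 over 3 process rows.
local_counts = [numroc(1000, 64, p, 0, 3) for p in range(3)]
print(local_counts, upper_bound(1000, 64, 3))
```

       The local counts sum to the global dimension, and none of them
       exceeds the upper bound.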


ARGUMENTS
       IBTYPE  (global input) INTEGER
               = 1: compute inv(U**H)*sub( A )*inv(U) or inv(L)*sub( A
               )*inv(L**H);
               = 2 or 3: compute U*sub( A )*U**H or L**H*sub( A )*L.

       UPLO    (global input) CHARACTER
               = 'U':  Upper triangle of sub( A ) is stored and sub(  B  )  is
               factored as U**H*U;
               =  'L':   Lower  triangle of sub( A ) is stored and sub( B ) is
               factored as L*L**H.

       N       (global input) INTEGER
               The order of the matrices sub( A ) and sub( B ).  N >= 0.

       A       (local input/local output) COMPLEX*16 pointer into the
               local memory to an array of  dimension  (LLD_A,  LOCc(JA+N-1)).
               On  entry,  this  array contains the local pieces of the N-by-N
               Hermitian distributed matrix sub( A ). If UPLO = 'U', the lead-
               ing N-by-N upper triangular part of sub( A ) contains the upper
               triangular part of the matrix, and its strictly lower  triangu-
               lar  part is not referenced.  If UPLO = 'L', the leading N-by-N
               lower triangular part of sub( A ) contains the lower triangular
               part  of  the matrix, and its strictly upper triangular part is
               not referenced.

               On exit, if INFO = 0, the transformed  matrix,  stored  in  the
               same format as sub( A ).

       IA      (global input) INTEGER
               A's global row index, which points to the beginning of the sub-
               matrix which is to be operated on.

       JA      (global input) INTEGER
               A's global column index, which points to the beginning  of  the
               submatrix which is to be operated on.

       DESCA   (global and local input) INTEGER array of dimension DLEN_.
               The array descriptor for the distributed matrix A.

       B       (local input) COMPLEX*16 pointer into the local memory
               to  an array of dimension (LLD_B, LOCc(JB+N-1)). On entry, this
               array contains the local pieces of the triangular  factor  from
               the Cholesky factorization of sub( B ), as returned by PZPOTRF.

       IB      (global input) INTEGER
               B's global row index, which points to the beginning of the sub-
               matrix which is to be operated on.

       JB      (global input) INTEGER
               B's  global  column index, which points to the beginning of the
               submatrix which is to be operated on.

       DESCB   (global and local input) INTEGER array of dimension DLEN_.
               The array descriptor for the distributed matrix B.

       SCALE   (global output) DOUBLE PRECISION
               Amount by which the eigenvalues should be scaled to compensate
               for the scaling performed in this routine.  At present, SCALE
               is always returned as 1.0; it is returned here to allow for
               future enhancement.

       WORK    (local workspace/local output) COMPLEX*16 array,
               dimension (LWORK).
               On exit, WORK( 1 ) returns the minimal and optimal LWORK.

       LWORK   (local or global input) INTEGER
               The dimension of the array WORK.  LWORK is local input and must
               satisfy LWORK >= MAX( NB * ( NP0 + 1 ), 3 * NB ).

               When IBTYPE = 1 and UPLO = 'L', PZHENGST provides improved
               performance when LWORK >= 2 * NP0 * NB + NQ0 * NB + NB * NB,

               where NB = MB_A = NB_A, NP0 = NUMROC( N, NB, 0, 0, NPROW ), NQ0
               = NUMROC( N, NB, 0, 0, NPCOL ).

               NUMROC is a ScaLAPACK tool function.  MYROW, MYCOL, NPROW and
               NPCOL can be determined by calling the subroutine
               BLACS_GRIDINFO.

               If LWORK = -1, then LWORK is global input and a workspace query
               is  assumed;  the  routine only calculates the optimal size for
               all work arrays. Each of these values is returned in the  first
               entry  of the corresponding work array, and no error message is
               issued by PXERBLA.
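
               The two workspace thresholds can be estimated ahead of time
               with a short serial calculation. The sketch below is an
               illustration only: numroc is a local port of the ScaLAPACK
               tool function, pzhengst_lwork is a hypothetical helper, and
               NQ0 is assumed to be computed over NPCOL, the usual
               convention for a column count. In practice a workspace query
               with LWORK = -1 is the more reliable way to obtain the size.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Serial port of the NUMROC counting logic (block-cyclic layout)."""
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb
    count = (nblocks // nprocs) * nb
    extrablks = nblocks % nprocs
    if mydist < extrablks:
        count += nb
    elif mydist == extrablks:
        count += n % nb
    return count

def pzhengst_lwork(n, nb, nprow, npcol):
    """Hypothetical helper returning (minimum LWORK, LWORK for the fast path).

    The fast path applies only when IBTYPE = 1 and UPLO = 'L'.
    """
    np0 = numroc(n, nb, 0, 0, nprow)
    nq0 = numroc(n, nb, 0, 0, npcol)   # assumption: NPCOL, not NPROW
    lwmin = max(nb * (np0 + 1), 3 * nb)
    lwfast = 2 * np0 * nb + nq0 * nb + nb * nb
    return lwmin, lwfast

# Assumed example: N = 1000, NB = 64 on a 2-by-3 process grid.
lwmin, lwfast = pzhengst_lwork(1000, 64, 2, 3)
print(lwmin, lwfast)
```

               Supplying at least the first value lets the routine run;
               supplying at least the second avoids the fallback to PZHEGST
               for insufficient workspace.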

       INFO    (global output) INTEGER
               = 0:  successful exit
               < 0:  If the i-th argument is an array and the j-th entry had an
               illegal value, then INFO = -(i*100+j); if the i-th argument is
               a scalar and had an illegal value, then INFO = -i.
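
               The INFO encoding can be unpacked mechanically. The sketch
               below is a hypothetical helper (decode_info is not part of
               ScaLAPACK) that maps a negative INFO back to an argument
               name, using the argument order from the SYNOPSIS, and to an
               entry index when the argument is an array.

```python
# Argument names in the order they appear in the PZHENGST call.
ARGS = ["IBTYPE", "UPLO", "N", "A", "IA", "JA", "DESCA", "B", "IB", "JB",
        "DESCB", "SCALE", "WORK", "LWORK", "INFO"]

def decode_info(info):
    """Return None on success, else (argument name, entry index or None)."""
    if info == 0:
        return None
    if info > 0:
        # Only INFO = 0 and INFO < 0 are described for this routine.
        raise ValueError("positive INFO is not documented for PZHENGST")
    code = -info
    if code >= 100:
        # Array argument: INFO = -(i*100 + j).
        position, entry = divmod(code, 100)
    else:
        # Scalar argument: INFO = -i.
        position, entry = code, None
    return ARGS[position - 1], entry

print(decode_info(-3))     # scalar case
print(decode_info(-706))   # array case
```

               Because the routine has fewer than 100 arguments, a magnitude
               of 100 or more can only come from the array form, so the two
               cases do not collide.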



ScaLAPACK routine               31 October 2017                    PZHENGST(3)