PDLATRZ(3)    ScaLAPACK routine of NEC Numeric Library Collection   PDLATRZ(3)

NAME
       PDLATRZ - reduce the M-by-N ( M<=N ) real upper trapezoidal matrix
       sub( A ) = [ A(IA:IA+M-1,JA:JA+M-1) A(IA:IA+M-1,JA+N-L:JA+N-1) ] to
       upper triangular form by means of orthogonal transformations

SYNOPSIS
       SUBROUTINE PDLATRZ( M, N, L, A, IA, JA, DESCA, TAU, WORK )

           INTEGER          IA, JA, L, M, N

           INTEGER          DESCA( * )

           DOUBLE PRECISION A( * ), TAU( * ), WORK( * )

PURPOSE
       PDLATRZ reduces the M-by-N ( M<=N ) real upper trapezoidal matrix
       sub( A ) = [ A(IA:IA+M-1,JA:JA+M-1) A(IA:IA+M-1,JA+N-L:JA+N-1) ] to
       upper triangular form by means of orthogonal transformations.

       The upper trapezoidal matrix sub( A ) is factored as

          sub( A ) = ( R  0 ) * Z,

       where Z is an N-by-N orthogonal matrix and R is an M-by-M upper
       triangular matrix.

       Notes
       =====

       Each global data object is described by an associated description
       vector. This vector stores the information required to establish the
       mapping between an object element and its corresponding process and
       memory location.

       Let A be a generic term for any 2D block cyclically distributed array.
       Such a global array has an associated description vector DESCA.
       In the following comments, the character _ should be read as
       "of the global array".

       NOTATION        STORED IN       EXPLANATION
       --------------- --------------- --------------------------------------
       DTYPE_A(global) DESCA( DTYPE_ ) The descriptor type. In this case,
                                       DTYPE_A = 1.
       CTXT_A (global) DESCA( CTXT_ )  The BLACS context handle, indicating
                                       the BLACS process grid A is distributed
                                       over. The context itself is global,
                                       but the handle (the integer value) may
                                       vary.
       M_A    (global) DESCA( M_ )     The number of rows in the global
                                       array A.
       N_A    (global) DESCA( N_ )     The number of columns in the global
                                       array A.
       MB_A   (global) DESCA( MB_ )    The blocking factor used to distribute
                                       the rows of the array.
       NB_A   (global) DESCA( NB_ )    The blocking factor used to distribute
                                       the columns of the array.
       RSRC_A (global) DESCA( RSRC_ )  The process row over which the first
                                       row of the array A is distributed.
       CSRC_A (global) DESCA( CSRC_ )  The process column over which the
                                       first column of the array A is
                                       distributed.
       LLD_A  (local)  DESCA( LLD_ )   The leading dimension of the local
                                       array. LLD_A >= MAX(1,LOCr(M_A)).

       Let K be the number of rows or columns of a distributed matrix, and
       assume that its process grid has dimension p x q.
       LOCr( K ) denotes the number of elements of K that a process would
       receive if K were distributed over the p processes of its process
       column.
       Similarly, LOCc( K ) denotes the number of elements of K that a process
       would receive if K were distributed over the q processes of its process
       row.
       The values of LOCr() and LOCc() may be determined via a call to the
       ScaLAPACK tool function, NUMROC:

               LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
               LOCc( N ) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).

       An upper bound for these quantities may be computed by:

               LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
               LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A

ARGUMENTS
       M       (global input) INTEGER
               The number of rows to be operated on, i.e. the number of rows
               of the distributed submatrix sub( A ). M >= 0.

       N       (global input) INTEGER
               The number of columns to be operated on, i.e. the number of
               columns of the distributed submatrix sub( A ). N >= 0.

       L       (global input) INTEGER
               The number of columns of the distributed submatrix sub( A )
               containing the meaningful part of the Householder reflectors.
               L > 0.

       A       (local input/local output) DOUBLE PRECISION pointer into the
               local memory to an array of dimension (LLD_A, LOCc(JA+N-1)).
               On entry, the local pieces of the M-by-N distributed matrix
               sub( A ) which is to be factored. On exit, the leading M-by-M
               upper triangular part of sub( A ) contains the upper
               triangular matrix R, and elements N-L+1 to N of the first M
               rows of sub( A ), with the array TAU, represent the orthogonal
               matrix Z as a product of M elementary reflectors.

       IA      (global input) INTEGER
               The row index in the global array A indicating the first row
               of sub( A ).

       JA      (global input) INTEGER
               The column index in the global array A indicating the first
               column of sub( A ).

       DESCA   (global and local input) INTEGER array of dimension DLEN_.
               The array descriptor for the distributed matrix A.

       TAU     (local output) DOUBLE PRECISION array, dimension LOCr(IA+M-1)
               This array contains the scalar factors of the elementary
               reflectors. TAU is tied to the distributed matrix A.

       WORK    (local workspace) DOUBLE PRECISION array, dimension (LWORK)
               LWORK >= Nq0 + MAX( 1, Mp0 ), where

               IROFF = MOD( IA-1, MB_A ), ICOFF = MOD( JA-1, NB_A ),
               IAROW = INDXG2P( IA, MB_A, MYROW, RSRC_A, NPROW ),
               IACOL = INDXG2P( JA, NB_A, MYCOL, CSRC_A, NPCOL ),
               Mp0   = NUMROC( M+IROFF, MB_A, MYROW, IAROW, NPROW ),
               Nq0   = NUMROC( N+ICOFF, NB_A, MYCOL, IACOL, NPCOL ),

               and NUMROC, INDXG2P are ScaLAPACK tool functions;
               MYROW, MYCOL, NPROW and NPCOL can be determined by calling
               the subroutine BLACS_GRIDINFO.

FURTHER DETAILS
       The factorization is obtained by Householder's method. The kth
       transformation matrix, Z( k ), which is used to introduce zeros into
       the (m - k + 1)th row of sub( A ), is given in the form

          Z( k ) = ( I     0   ),
                   ( 0  T( k ) )

       where

          T( k ) = I - tau*u( k )*u( k )',   u( k ) = (   1    ),
                                                      (   0    )
                                                      ( z( k ) )

       tau is a scalar and z( k ) is an ( n - m ) element vector.
       tau and z( k ) are chosen to annihilate the elements of the kth row
       of sub( A ).

       The scalar tau is returned in the kth element of TAU and the vector
       u( k ) in the kth row of sub( A ), such that the elements of z( k )
       are in a( k, m + 1 ), ..., a( k, n ). The elements of R are returned
       in the upper triangular part of sub( A ).

       Z is given by

          Z =  Z( 1 ) * Z( 2 ) * ... * Z( m ).

ScaLAPACK routine               31 October 2017                     PDLATRZ(3)
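
EXAMPLE
       The LWORK bound listed under WORK can be evaluated before the call.
       The block below is a minimal Python sketch of that arithmetic, not
       part of ScaLAPACK or of this library. The functions numroc and
       indxg2p are illustrative re-implementations of the tool functions
       NUMROC and INDXG2P named above, assuming 0-based process coordinates
       as returned by BLACS_GRIDINFO; pdlatrz_lwork is a hypothetical helper
       name introduced only for this example.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Local row/column count owned by process iproc (sketch of NUMROC)."""
    # Distance of this process from the one holding the first block.
    mydist = (nprocs + iproc - isrcproc) % nprocs
    nblocks = n // nb                      # number of complete blocks
    count = (nblocks // nprocs) * nb       # blocks every process receives
    extrablks = nblocks % nprocs           # complete blocks left over
    if mydist < extrablks:
        count += nb                        # one extra complete block
    elif mydist == extrablks:
        count += n % nb                    # the trailing partial block
    return count


def indxg2p(indxglob, nb, iproc, isrcproc, nprocs):
    """Process coordinate owning 1-based global index indxglob (sketch of INDXG2P)."""
    # iproc is unused; it is kept only to mirror the documented argument list.
    return (isrcproc + (indxglob - 1) // nb) % nprocs


def pdlatrz_lwork(m, n, ia, ja, mb_a, nb_a, rsrc_a, csrc_a,
                  myrow, mycol, nprow, npcol):
    """Hypothetical helper: minimum LWORK = Nq0 + MAX( 1, Mp0 ) on this process."""
    iroff = (ia - 1) % mb_a
    icoff = (ja - 1) % nb_a
    iarow = indxg2p(ia, mb_a, myrow, rsrc_a, nprow)
    iacol = indxg2p(ja, nb_a, mycol, csrc_a, npcol)
    mp0 = numroc(m + iroff, mb_a, myrow, iarow, nprow)
    nq0 = numroc(n + icoff, nb_a, mycol, iacol, npcol)
    return nq0 + max(1, mp0)


if __name__ == "__main__":
    # Example: 5-by-10 sub( A ) at IA = JA = 1, 3-by-3 blocks, 2 x 2 grid.
    for myrow in range(2):
        for mycol in range(2):
            lwork = pdlatrz_lwork(5, 10, 1, 1, 3, 3, 0, 0, myrow, mycol, 2, 2)
            print(f"process ({myrow},{mycol}): LWORK >= {lwork}")
```

       Because the bound depends on MYROW and MYCOL, each process may need a
       different amount of workspace; allocating the maximum over the grid
       is one simple way to use a single size everywhere.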