Please refer to the documents published by the MPI Forum (the MPI Standard) for further details.
2.2   Execution Management and Environment Querying
2.2.1   Initialization and Finalization of MPI Environment
Programs that use MPI must call the procedure MPI_INIT
or MPI_INIT_THREAD to initialize the
MPI environment, and the procedure MPI_FINALIZE to finalize it.
Users can call various MPI procedures, such as point-to-point and collective communication procedures, between initialization and finalization. The procedures MPI_INITIALIZED, MPI_FINALIZED, and MPI_GET_VERSION can be called at any time, even before the procedure MPI_INIT or MPI_INIT_THREAD has been invoked and after the procedure MPI_FINALIZE has been invoked. The procedure MPI_INITIALIZED determines whether the procedure MPI_INIT or MPI_INIT_THREAD has already been called, and the procedure MPI_FINALIZED determines whether the procedure MPI_FINALIZE has already been called.
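For illustration, the following C fragment is a minimal sketch (not part of the manual's examples; the function name ensure_mpi is only illustrative) that uses these query procedures to initialize MPI only when it has not been initialized yet and to report the supported version of the MPI standard.

#include <stdio.h>
#include "mpi.h"

/* Initialize MPI only if it has not been initialized yet, and query the
   version of the MPI standard supported by the library. */
int ensure_mpi(int *argc, char ***argv)
{
    int initialized, finalized, version, subversion;

    MPI_Initialized(&initialized);   /* may be called before MPI_INIT */
    MPI_Finalized(&finalized);       /* may be called after MPI_FINALIZE */
    if (!initialized && !finalized)
        MPI_Init(argc, argv);

    MPI_Get_version(&version, &subversion);
    printf("MPI standard version %d.%d\n", version, subversion);
    return MPI_SUCCESS;
}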
The following is an example of a Fortran program that calls the MPI procedures
for the initialization and finalization.
PROGRAM MAIN
    use mpi
    INTEGER IERROR
! Establish the MPI environment
    CALL MPI_INIT(IERROR)
! Terminate the MPI environment
    CALL MPI_FINALIZE(IERROR)
    STOP
END
#include "mpi.h"

int main(int argc, char **argv)
{
    /* Establish the MPI environment */
    MPI_Init(&argc, &argv);
    /* Terminate the MPI environment */
    MPI_Finalize();
    return 0;
}
2.2.2   Opaque Object and Handle
Objects managed by MPI, such as communicators and MPI datatypes, are
called opaque objects.
MPI defines data called handles that correspond to the respective opaque objects.
Users operate on opaque objects indirectly by passing the corresponding handles to
MPI procedures; the objects cannot be accessed directly.
2.2.3   Error Handling
MPI defines an error-handling mechanism that enables users
to handle errors detected in MPI procedures, such as invalid
arguments. With this mechanism, when an error is detected,
the corresponding user-defined function, called an error handler, is invoked.
Please refer to Section 2.8 for details.
2.3   Process Management
2.3.1   MPI Process
MPI processes are units of processing that are executed independently of each other,
each with its own CPU resources and memory space.
Execution of an MPI program starts at the beginning
of the program. However, MPI procedures can be invoked only after the
procedure MPI_INIT or MPI_INIT_THREAD has been
invoked (with the exceptions described in Section 2.2.1).
The dynamic process creation facility enables creation of additional MPI processes
during the execution of an MPI program. This facility is
explained in Section 2.4.
2.3.2   Communicator
Ordered sets of MPI processes are called groups. The processes in a group
have unique integer identification numbers starting at zero,
which are called ranks.
A communicator indicates
a communication scope corresponding to
a group of MPI processes that can communicate with
each other.
A communicator specifies a group, and MPI processes are identified
by ranks in the group. This identification is needed for specifying
processes to or from which messages are transferred.
Users can define a new communicator corresponding to a new group. MPI defines the predefined communicator MPI_COMM_WORLD as an MPI reserved name. The communicator MPI_COMM_WORLD is available after the procedure MPI_INIT or MPI_INIT_THREAD is called, and the group corresponding to the communicator contains all MPI processes that are started with the MPI execution commands.
Communicators are classified into intra-communicators and inter-communicators. The former consists of only one group of MPI processes; in communication on an intra-communicator, messages are transferred within the group. The latter consists of two disjoint groups. For a process in an inter-communicator, one group is called the local group, to which the process belongs, and the other is the remote group, to which the process does not belong. In communication on an inter-communicator, messages are always transferred between the two groups.
2.3.3   Determining the Number of MPI Processes
The total number of MPI processes at the beginning of a program can
be obtained by calling the procedure MPI_COMM_SIZE with the communicator
MPI_COMM_WORLD as an argument.
2.3.4   Ranking and Identification of MPI Processes
MPI processes can be identified using a communicator and ranks, which are
the identification numbers of processes in a communicator. Ranks have
integer values in
the range from 0 to one less than the
number of MPI processes in the communicator. The rank of the
executing process can be obtained
by calling the procedure MPI_COMM_RANK using the communicator as
an argument.
Users can construct a group of specified MPI processes and assign a communicator to the group. When a communicator is defined, ranks in the communicator are assigned to the processes. Thus, if an MPI process belongs to two communicator groups, the MPI process generally has different ranks in the respective communicators.
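For illustration, the following C fragment is a minimal sketch (not part of the manual's examples; the function name show_ranks is only illustrative) that obtains the number of processes and the rank in MPI_COMM_WORLD and then creates a new communicator with MPI_COMM_SPLIT, in which the same process generally gets a different rank.

#include "mpi.h"

int show_ranks(void)
{
    int size, rank, color, newrank;
    MPI_Comm newcomm;

    /* Total number of MPI processes and the rank of the calling process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Split MPI_COMM_WORLD into two communicators (even and odd ranks);
       ranks are renumbered from 0 within each new communicator */
    color = rank % 2;
    MPI_Comm_split(MPI_COMM_WORLD, color, rank, &newcomm);
    MPI_Comm_rank(newcomm, &newrank);

    MPI_Comm_free(&newcomm);
    return MPI_SUCCESS;
}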
2.4   Dynamic Process Management
The dynamic process management facility consists of dynamic process creation and dynamic process connection, which are described below.
Through the dynamic process creation facility, new MPI processes can be started dynamically during the execution of a program. Two procedures are provided for this purpose: MPI_COMM_SPAWN and MPI_COMM_SPAWN_MULTIPLE. The former spawns MPI processes from a single executable file with the same arguments. The latter spawns MPI processes from two or more executable files; it must also be used when a single executable file is used but with different arguments for each process.
As a result of dynamic process creation, a new communicator MPI_COMM_WORLD, which is completely different from the existing communicator MPI_COMM_WORLD, is generated and contains the spawned MPI processes. Furthermore, an inter-communicator that consists of the existing communicator MPI_COMM_WORLD and the new MPI_COMM_WORLD is generated.
MPI processes started by the procedure MPI_COMM_SPAWN or MPI_COMM_SPAWN_MULTIPLE run as if they had been started with the MPI execution commands. The spawned child processes can obtain an inter-communicator that is necessary to communicate with their parent processes.
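As an illustration, the following C fragments give a minimal sketch of both sides, assuming a child executable named "worker" exists (the executable name and function names are only illustrative).

#include "mpi.h"

/* Parent side: spawn two processes running the (assumed) executable "worker". */
int spawn_workers(void)
{
    MPI_Comm intercomm;
    int errcodes[2];

    MPI_Comm_spawn("worker", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &intercomm, errcodes);
    /* intercomm now connects the spawning group with the spawned group */
    return MPI_SUCCESS;
}

/* Child side: obtain the inter-communicator to the parent processes. */
int find_parent(void)
{
    MPI_Comm parent;

    MPI_Comm_get_parent(&parent);
    if (parent == MPI_COMM_NULL) {
        /* this process was not started by MPI_COMM_SPAWN */
    }
    return MPI_SUCCESS;
}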
The dynamic process connection facility can be used to establish communication between two groups of MPI processes that do not share a communicator. The procedures MPI_OPEN_PORT, MPI_COMM_ACCEPT, and MPI_CLOSE_PORT are used on the server side, and the procedure MPI_COMM_CONNECT on the client side. The two groups are connected by an inter-communicator. Within NEC MPI, the communication between the two groups is performed by TCP/IP function calls. Note that only MPI applications that use the same numerical storage width can be connected by dynamic process connection.
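The following C fragments give a minimal sketch of this server/client handshake; how the port name is exchanged (for example through a file or MPI_PUBLISH_NAME) is outside the sketch and only indicated by a comment.

#include "mpi.h"

/* Server side: open a port, accept one client, then close the port. */
int server(MPI_Comm *client_comm)
{
    char port_name[MPI_MAX_PORT_NAME];

    MPI_Open_port(MPI_INFO_NULL, port_name);
    /* make port_name known to the client, e.g. via a file or MPI_Publish_name */
    MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, client_comm);
    MPI_Close_port(port_name);
    return MPI_SUCCESS;
}

/* Client side: connect to the port name obtained from the server. */
int client(const char *port_name, MPI_Comm *server_comm)
{
    MPI_Comm_connect(port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, server_comm);
    return MPI_SUCCESS;
}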
2.5   Inter-Process Communication
Inter-process communication functionality can be classified into point-to-point communication, collective operations, and one-sided communication, which are described in the following sections.
2.5.1   Point-to-Point Communication
In point-to-point communication, communication between two MPI processes
is established when one MPI process calls the send procedure and the other
MPI process calls the receive procedure.
The arguments passed to a send or receive procedure include the user data that is the object of the communication, its type and size, the communicator and the rank that identify the communication partner, and a tag that identifies a communication request. In a send operation, the tag is attached to the communication message; in a receive operation, it is used to identify the communication message. A communication message includes not only the user data but also its type and size, and the rank and tag information.
Specifying the MPI reserved name MPI_ANY_TAG as the tag in the receive call makes it possible to receive a message sent by the party with the specified rank regardless of the value of its tag. Similarly, specifying the MPI reserved name MPI_ANY_SOURCE as the rank makes it possible to receive a message having the specified tag value without specifying the party. When MPI_ANY_TAG and MPI_ANY_SOURCE are specified together, the first matching message available is received, without specifying either the party or the tag. Neither MPI_ANY_TAG nor MPI_ANY_SOURCE can be specified in a send procedure.
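The following C fragment is a minimal sketch (the function name receive_any is only illustrative) of a receive with both wildcards; the actual source and tag of the received message are obtained from the status argument.

#include "mpi.h"

int receive_any(int *data, int n)
{
    MPI_Status status;
    int source, tag;

    /* Receive from any source with any tag on MPI_COMM_WORLD */
    MPI_Recv(data, n, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);

    /* The status records who actually sent the message and with which tag */
    source = status.MPI_SOURCE;
    tag    = status.MPI_TAG;
    (void)source; (void)tag;
    return MPI_SUCCESS;
}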
Point-to-point communication provides four communication modes: synchronous, buffered, ready, and standard.
The name of the send procedure used determines the specific mode to be used in a particular communication session. For all modes, the name of the receive procedure is MPI_RECV.
This section discusses the four communication modes.
The following is an example of a Fortran program that uses synchronous mode.
Example:
SUBROUTINE SUB(DATA, N)
    use mpi
    INTEGER MYRANK, IERROR, IRANK, ITAG
    INTEGER DATA(N)
! "STATUS" is an output argument that provides information
! about the received message
! Define the type of a handle, status
    INTEGER STATUS(MPI_STATUS_SIZE)
! Each process in MPI_COMM_WORLD finds out its rank
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK, IERROR)
    ITAG = 10
    IF (MYRANK.EQ.0) THEN
        IRANK = 1
! synchronous mode send operation of data to the destination process with rank = 1
        CALL MPI_SSEND(DATA, N, MPI_INTEGER, IRANK, ITAG, MPI_COMM_WORLD, IERROR)
    ENDIF
    IF (MYRANK.EQ.1) THEN
        IRANK = 0
! receive operation of data from the source process with rank = 0
        CALL MPI_RECV(DATA, N, MPI_INTEGER, IRANK, ITAG, MPI_COMM_WORLD, STATUS, IERROR)
    ENDIF
    RETURN
END
int sub(int *data, int n)
{
    int myrank, irank, itag;
    /* "status" is an output argument that provides information
       about the received message */
    /* Define the type of a handle, status */
    MPI_Status status;

    /* Each process in MPI_COMM_WORLD finds out its rank */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    itag = 10;
    if (myrank == 0) {
        irank = 1;
        /* synchronous mode send operation of data to the destination process with rank = 1 */
        MPI_Ssend(data, n, MPI_INT, irank, itag, MPI_COMM_WORLD);
    }
    if (myrank == 1) {
        irank = 0;
        /* receive operation of data from the source process with rank = 0 */
        MPI_Recv(data, n, MPI_INT, irank, itag, MPI_COMM_WORLD, &status);
    }
    return MPI_SUCCESS;
}
The send procedure name for the buffered mode is MPI_BSEND.
Before buffered mode communication,
a buffer area of the size necessary for the communication has to be
registered with the procedure MPI_BUFFER_ATTACH.
Note that, in addition to the area needed to store messages,
an extra area of MPI_BSEND_OVERHEAD bytes is needed for each message,
because NEC MPI's management data is also stored in the user buffer.
After the buffer has been used, it should be deregistered
with the procedure MPI_BUFFER_DETACH.
The following is an example of a Fortran program that uses buffered mode
communication.
Example:
SUBROUTINE SUB(DATA, N, BUFF, M)
    use mpi
    INTEGER MYRANK, IERROR, IRANK, ITAG
    INTEGER DATA(N), BUFF(M), SIZE
    INTEGER STATUS(MPI_STATUS_SIZE)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK, IERROR)
! allocation of a buffer with the size m*4 bytes including an extra space
! for MPI_BSEND_OVERHEAD
    CALL MPI_BUFFER_ATTACH(BUFF, M*4, IERROR)
    ITAG = 10
    IF (MYRANK.EQ.0) THEN
        IRANK = 1
! buffered mode send operation of data to the destination process
! with rank = 1
        CALL MPI_BSEND(DATA, N, MPI_INTEGER, IRANK, ITAG, MPI_COMM_WORLD, IERROR)
    ENDIF
    IF (MYRANK.EQ.1) THEN
        IRANK = 0
        CALL MPI_RECV(DATA, N, MPI_INTEGER, IRANK, ITAG, MPI_COMM_WORLD, STATUS, IERROR)
    ENDIF
! deallocation of a buffer
    CALL MPI_BUFFER_DETACH(BUFF, SIZE, IERROR)
    RETURN
END
int sub(int *data, int n, int m, int *buf)
{
    int myrank, irank, itag, size;
    MPI_Status status;

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    /* allocation of a buffer with the size m*4 bytes including an extra space
       for MPI_BSEND_OVERHEAD */
    MPI_Buffer_attach(buf, m*4);
    itag = 10;
    if (myrank == 0) {
        irank = 1;
        /* buffered mode send operation of data to the destination process
           with rank = 1 */
        MPI_Bsend(data, n, MPI_INT, irank, itag, MPI_COMM_WORLD);
    }
    if (myrank == 1) {
        irank = 0;
        MPI_Recv(data, n, MPI_INT, irank, itag, MPI_COMM_WORLD, &status);
    }
    /* deallocation of a buffer; the address of the buffer pointer is passed */
    MPI_Buffer_detach(&buf, &size);
    return MPI_SUCCESS;
}
The following is an example of a Fortran program that uses ready mode communication. In this program, an error can occur if the send procedure for the rank-0 MPI process is called before the receive procedure for the rank-1 MPI process.
Note that in NEC MPI the ready mode behaves the same as the standard mode, so an error does not occur even if the send procedure is called before the receive procedure.
Example:
SUBROUTINE SUB(DATA, N)
    use mpi
    INTEGER MYRANK, IERROR, IRANK, ITAG
    INTEGER DATA(N)
    INTEGER STATUS(MPI_STATUS_SIZE)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK, IERROR)
    ITAG = 10
    IF (MYRANK.EQ.1) THEN
        IRANK = 0
        CALL MPI_RECV(DATA, N, MPI_INTEGER, IRANK, ITAG, MPI_COMM_WORLD, STATUS, IERROR)
    ENDIF
    IF (MYRANK.EQ.0) THEN
        IRANK = 1
! ready mode send operation of data to the destination process with rank = 1
        CALL MPI_RSEND(DATA, N, MPI_INTEGER, IRANK, ITAG, MPI_COMM_WORLD, IERROR)
    ENDIF
    IF (IERROR.NE.0) THEN
        WRITE(6,*) 'ABNORMAL COMPLETED ON MPI_RSEND'
    ENDIF
    RETURN
END
int sub(int *data, int n)
{
    int myrank, irank, itag;
    MPI_Status status;

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    itag = 10;
    if (myrank == 1) {
        irank = 0;
        MPI_Recv(data, n, MPI_INT, irank, itag, MPI_COMM_WORLD, &status);
    }
    if (myrank == 0) {
        irank = 1;
        /* ready mode send operation of data to the destination process with rank = 1 */
        MPI_Rsend(data, n, MPI_INT, irank, itag, MPI_COMM_WORLD);
    }
    return MPI_SUCCESS;
}
It should be noted that programs that use a standard-mode communication procedure and that run normally when buffered mode is selected can cause a deadlock when executed in synchronous mode.
Suppose, for example, that a program sends messages with tag values ITAG1 and ITAG2, in that order, in the rank-0 MPI process, and receives messages with tag values ITAG2 and ITAG1, in that order, in the rank-1 MPI process. This program runs normally with buffered-mode communication. However, if the implementation applies synchronous mode, both the sender and the receiver, when calling their first communication procedure, wait until the other side calls its second communication procedure. This results in a deadlock.
Example:
SUBROUTINE SUB(DATA1, N1, DATA2, N2)
    use mpi
    INTEGER MYRANK, IERROR, IRANK, ITAG1, ITAG2
    INTEGER DATA1(N1), DATA2(N2)
    INTEGER STATUS(MPI_STATUS_SIZE)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK, IERROR)
    ITAG1 = 10
    ITAG2 = 20
    IF (MYRANK.EQ.0) THEN
        IRANK = 1
! standard mode send operations of DATA1 and DATA2 to the destination
! process with rank = 1
        CALL MPI_SEND(DATA1, N1, MPI_INTEGER, IRANK, ITAG1, MPI_COMM_WORLD, IERROR)
        CALL MPI_SEND(DATA2, N2, MPI_INTEGER, IRANK, ITAG2, MPI_COMM_WORLD, IERROR)
    ENDIF
    IF (MYRANK.EQ.1) THEN
        IRANK = 0
! receive operations of DATA2 and DATA1 from the source process
! with rank = 0
        CALL MPI_RECV(DATA2, N2, MPI_INTEGER, IRANK, ITAG2, MPI_COMM_WORLD, STATUS, IERROR)
        CALL MPI_RECV(DATA1, N1, MPI_INTEGER, IRANK, ITAG1, MPI_COMM_WORLD, STATUS, IERROR)
    ENDIF
    RETURN
END
int sub(int *data1, int n1, int *data2, int n2)
{
    int myrank, irank, itag1, itag2;
    MPI_Status status;

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    itag1 = 10;
    itag2 = 20;
    if (myrank == 0) {
        irank = 1;
        /* standard mode send operations of data1 and data2 to the destination
           process with rank = 1 */
        MPI_Send(data1, n1, MPI_INT, irank, itag1, MPI_COMM_WORLD);
        MPI_Send(data2, n2, MPI_INT, irank, itag2, MPI_COMM_WORLD);
    }
    if (myrank == 1) {
        irank = 0;
        /* receive operations of data2 and data1 from the source process
           with rank = 0 */
        MPI_Recv(data2, n2, MPI_INT, irank, itag2, MPI_COMM_WORLD, &status);
        MPI_Recv(data1, n1, MPI_INT, irank, itag1, MPI_COMM_WORLD, &status);
    }
    return MPI_SUCCESS;
}
The four communication modes described above are also called blocking communication modes. Point-to-point communication also provides nonblocking communication operations for each of the four modes.
Whereas blocking communication creates and sends communication messages within the send procedure and selects and receives communication messages within the receive procedure, nonblocking communication requires the execution of a communication startup procedure and a communication termination confirmation procedure. The communication startup procedure returns control to the user after receiving the arguments and generating a communication request. The actual communication processing, such as generating and sending communication messages and selecting and receiving communication messages, is then performed by the MPI implementation concurrently with the processing of the user's program.
A communication request establishes the association between a communication startup procedure and a communication termination confirmation procedure. The completion of a communication is enforced by calling the communication termination confirmation procedure with the communication request as an argument.
When using nonblocking communication, the user must ensure that the sending process does not update the user data being transmitted, and that the receiving process does not update or reference the user data being received, from the time the communication startup procedure is called until the corresponding communication termination confirmation procedure is called.
The procedure name for nonblocking communication has the letter I attached to the prefix MPI_ of the blocking communication procedure name. Thus, the name of the begin-to-receive procedure is MPI_IRECV and the name of the begin-to-send procedure is MPI_ISSEND for synchronous mode, MPI_IBSEND for buffered mode, MPI_IRSEND for ready mode, and MPI_ISEND for standard mode. The name of the procedure to wait for the completion of communication is MPI_WAIT. This name is common among all send and receive procedures in any mode.
It is not necessary that a send in blocking communication be matched by a receive in blocking communication, or that a send in nonblocking communication be matched by a receive in nonblocking communication. Data transmitted in nonblocking mode can be received in blocking mode, and vice versa.
The following is an example of a Fortran program that performs nonblocking communication in standard mode both for send and receive.
Example:
SUBROUTINE SUB(DATA1, N1, DATA2, N2)
    use mpi
    INTEGER MYRANK, IERROR, IRANK, ITAG, I
    INTEGER ISTAT(MPI_STATUS_SIZE)
! Define the type of two handles, IREQ0 and IREQ1
    INTEGER IREQ0, IREQ1
    DOUBLE PRECISION DATA1(N1), DATA2(N2)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK, IERROR)
    ITAG = 10
    IF (MYRANK.EQ.0) THEN
        IRANK = 1
! nonblocking standard mode send operation of DATA1 to the process with rank=1
        CALL MPI_ISEND(DATA1, N1, MPI_DOUBLE_PRECISION, IRANK, ITAG, MPI_COMM_WORLD, IREQ0, IERROR)
    ENDIF
    IF (MYRANK.EQ.1) THEN
        IRANK = 0
! nonblocking standard mode receive operation of DATA1 from the process with rank=0
        CALL MPI_IRECV(DATA1, N1, MPI_DOUBLE_PRECISION, IRANK, ITAG, MPI_COMM_WORLD, IREQ1, IERROR)
    ENDIF
    DO I = 1, N2
        DATA2(I) = SQRT(DBLE(I))
    ENDDO
    IF (MYRANK.EQ.0) THEN
! wait until the send operation completes
        CALL MPI_WAIT(IREQ0, ISTAT, IERROR)
    ENDIF
    IF (MYRANK.EQ.1) THEN
! wait until the receive operation completes
        CALL MPI_WAIT(IREQ1, ISTAT, IERROR)
    ENDIF
    RETURN
END
int sub(double *data1, int n1, double *data2, int n2)
{
    int myrank, irank, itag;
    /* Define the type of two handles, ireq0 and ireq1 */
    MPI_Request ireq0, ireq1;
    MPI_Status status;

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    itag = 10;
    if (myrank == 0) {
        irank = 1;
        /* nonblocking standard mode send operation of data1 to the process with rank=1 */
        MPI_Isend(data1, n1, MPI_DOUBLE, irank, itag, MPI_COMM_WORLD, &ireq0);
    }
    if (myrank == 1) {
        irank = 0;
        /* nonblocking standard mode receive operation of data1 from the process with rank=0 */
        MPI_Irecv(data1, n1, MPI_DOUBLE, irank, itag, MPI_COMM_WORLD, &ireq1);
    }
    for (int i = 0; i < n2; i++) data2[i] = sqrt((double)(i + 1));
    /* wait until the send operation completes */
    if (myrank == 0) MPI_Wait(&ireq0, &status);
    /* wait until the receive operation completes */
    if (myrank == 1) MPI_Wait(&ireq1, &status);
    return MPI_SUCCESS;
}
Prior Confirmation of Communication Messages
In point-to-point communication, before calling the receive procedure, the program can verify whether a target communication message has already been sent and is ready to be received. The procedures MPI_PROBE and MPI_IPROBE are used to interrogate communication messages. The status of a target communication message can be ascertained by specifying, in the arguments, the communicator, the rank of the sending MPI process, and the tag value of the communication message. The procedure MPI_PROBE waits until the target communication message can be received. The procedure MPI_IPROBE, on the other hand, only returns whether the message is ready to be received, without waiting for it.
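The following C fragment is a minimal sketch (the function name probe_and_receive is only illustrative) that probes for a message, determines its size with MPI_GET_COUNT, and then receives it.

#include <stdlib.h>
#include "mpi.h"

int probe_and_receive(int source, int tag)
{
    MPI_Status status;
    int count;
    int *buf;

    /* Wait until a matching message is ready to be received */
    MPI_Probe(source, tag, MPI_COMM_WORLD, &status);

    /* Determine the number of MPI_INT elements in the pending message */
    MPI_Get_count(&status, MPI_INT, &count);

    buf = (int *)malloc(count * sizeof(int));
    MPI_Recv(buf, count, MPI_INT, status.MPI_SOURCE, status.MPI_TAG,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    free(buf);
    return MPI_SUCCESS;
}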
2.5.2   Collective Operations
In a collective operation, all MPI processes belonging to the group associated
with a given communicator call the collective operation procedure in order
to perform communication, computation, and/or execution
control. All collective procedures take a communicator as an argument.
Collective operations are applicable to both intra-communicators and inter-communicators.
This section explains synchronization control, broadcast communication, and GATHER/SCATTER communication, which are the principal functions among the collective operations.
The name of the procedure for synchronization is MPI_BARRIER. Each MPI process in the group that calls this procedure waits until all other MPI processes in the communicator specified in the argument list have also invoked the procedure. When an inter-communicator is specified as the communicator argument, the procedure MPI_BARRIER returns only after all MPI processes in both the local and remote groups have invoked it.
The following is an example of a Fortran program that uses synchronous control (barrier). In this example, an intra-communicator MPI_COMM_WORLD is used.
Example:
SUBROUTINE SUB(DATA, N)
    use mpi
    INTEGER MYRANK, IERROR, IRANK, ITAG
    INTEGER DATA(N)
    INTEGER IREQ
    INTEGER STATUS(MPI_STATUS_SIZE)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK, IERROR)
    ITAG = 10
    IF (MYRANK.EQ.1) THEN
        IRANK = 0
        CALL MPI_IRECV(DATA, N, MPI_INTEGER, IRANK, ITAG, MPI_COMM_WORLD, IREQ, IERROR)
    ENDIF
! Ensure that the MPI_IRECV operation has been started on all processes
! in the communicator before the ready mode send is issued
    CALL MPI_BARRIER(MPI_COMM_WORLD, IERROR)
    IF (MYRANK.EQ.0) THEN
        IRANK = 1
        CALL MPI_RSEND(DATA, N, MPI_INTEGER, IRANK, ITAG, MPI_COMM_WORLD, IERROR)
    ENDIF
    IF (MYRANK.EQ.1) THEN
        CALL MPI_WAIT(IREQ, STATUS, IERROR)
    ENDIF
    RETURN
END
int sub(int *data, int n)
{
    int myrank, irank, itag;
    MPI_Request ireq;
    MPI_Status status;

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    itag = 10;
    if (myrank == 1) {
        irank = 0;
        MPI_Irecv(data, n, MPI_INT, irank, itag, MPI_COMM_WORLD, &ireq);
    }
    /* Ensure that the MPI_Irecv operation has been started on all processes
       in the communicator before the ready mode send is issued */
    MPI_Barrier(MPI_COMM_WORLD);
    if (myrank == 0) {
        irank = 1;
        MPI_Rsend(data, n, MPI_INT, irank, itag, MPI_COMM_WORLD);
    }
    if (myrank == 1) MPI_Wait(&ireq, &status);
    return MPI_SUCCESS;
}
The name of the procedure for broadcast communication is MPI_BCAST. This procedure takes data area, data type, data size, and rank as the arguments in addition to the communicator argument. All MPI processes that call the procedure must specify the same value as the argument rank. In broadcasting communication, the MPI process that has the specified rank becomes the root process that sends data. All other MPI processes receive data.
The root process means the unique sender or receiver in a collective operation. The root process is used not only in broadcasting communication, but also in GATHER communication and SCATTER communication described in the next section.
When an inter-communicator is specified as the communicator argument, the root process belongs to one of the two groups of the inter-communicator. Data on the root process is transferred to all MPI processes in the remote group.
The following is an example of a Fortran program that uses broadcasting communication. In this example, the rank-0 MPI process specified in the rank value argument is the root process that sends DATA, and all other MPI processes receive DATA.
Example:
SUBROUTINE SUB(DATA, N)
    use mpi
    INTEGER IERROR, IROOT, MYRANK
    INTEGER DATA(N)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK, IERROR)
    IF (MYRANK.EQ.0) THEN
        DO I = 1, N
            DATA(I) = I
        ENDDO
    ENDIF
! broadcast send operation of data from the process with rank=0 to
! all other processes in MPI_COMM_WORLD
    IROOT = 0
    CALL MPI_BCAST(DATA, N, MPI_INTEGER, IROOT, MPI_COMM_WORLD, IERROR)
    RETURN
END
int sub(int *data, int n)
{
    int myrank, iroot;

    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    if (myrank == 0) {
        for (int i = 0; i < n; i++) data[i] = i;
    }
    iroot = 0;
    /* broadcast send operation of data from the process with rank=0 to
       all other processes in MPI_COMM_WORLD */
    MPI_Bcast(data, n, MPI_INT, iroot, MPI_COMM_WORLD);
    return MPI_SUCCESS;
}
Gather Communication and Scatter Communication
The procedure MPI_GATHER gathers data from all MPI processes into the root process. The procedure MPI_SCATTER distributes data from the root process to all MPI processes. The SCATTER communication differs from the broadcasting communication in that, whereas the broadcasting communication broadcasts the same data to all MPI processes in the communicator, the SCATTER communication transmits different parts of the send buffer to respective different MPI processes.
When an inter-communicator is specified as the communicator argument, the data-gathering operation of the procedure MPI_GATHER is performed within the group that does not contain the root process, and the result is transferred to the root process in the remote group. In the procedure MPI_SCATTER, data on the root process is distributed to the MPI processes in the remote group.
The following is an example of a Fortran program that uses the procedure MPI_GATHER, where N denotes the number of elements in the array ALLDATA, and M denotes the number of elements in the array PARTDATA. The value of N must be greater than or equal to that of M * "the number of processes in the communicator MPI_COMM_WORLD".
Example:
SUBROUTINE SUB(ALLDATA, N, PARTDATA, M)
    use mpi
    INTEGER IERROR, IROOT
    INTEGER ALLDATA(N), PARTDATA(M)
! gather operation of the buffer PARTDATA from all processes in MPI_COMM_WORLD
! to the buffer ALLDATA on the root process with rank=0
    IROOT = 0
    CALL MPI_GATHER(PARTDATA, M, MPI_INTEGER, ALLDATA, M, MPI_INTEGER, IROOT, &
                    MPI_COMM_WORLD, IERROR)
    RETURN
END
int sub(int *alldata, int n, int *partdata, int m)
{
    int iroot;

    iroot = 0;
    /* gather operation of the buffer partdata from all processes in MPI_COMM_WORLD
       to the buffer alldata on the root process with rank=0 */
    MPI_Gather(partdata, m, MPI_INT, alldata, m, MPI_INT, iroot, MPI_COMM_WORLD);
    return MPI_SUCCESS;
}
The following is an example of a Fortran program that uses the procedure MPI_SCATTER, where N denotes the number of elements in the array ALLDATA, and M denotes the number of elements in the array PARTDATA. The value of N must be greater than or equal to that of M * "the number of processes in the communicator MPI_COMM_WORLD".
Example:
SUBROUTINE SUB(ALLDATA, N, PARTDATA, M)
    use mpi
    INTEGER IERROR, IROOT
    INTEGER ALLDATA(N), PARTDATA(M)
! scatter operation of the buffer ALLDATA from the root process with rank=0
! to the buffers PARTDATA on all processes in MPI_COMM_WORLD
    IROOT = 0
    CALL MPI_SCATTER(ALLDATA, M, MPI_INTEGER, PARTDATA, M, MPI_INTEGER, &
                     IROOT, MPI_COMM_WORLD, IERROR)
    RETURN
END
int sub(int *alldata, int n, int *partdata, int m)
{
    int iroot;

    iroot = 0;
    /* scatter operation of the buffer alldata from the root process with rank=0
       to the buffers partdata on all processes in MPI_COMM_WORLD */
    MPI_Scatter(alldata, m, MPI_INT, partdata, m, MPI_INT, iroot, MPI_COMM_WORLD);
    return MPI_SUCCESS;
}
The collective operations explained in the previous sections are blocking operations, which perform communication and calculations and return control to the user program only after these operations are completed. Like point-to-point communication, collective operations also have nonblocking variants. The procedure names for nonblocking collective operations have the letter I attached immediately after the prefix MPI_ of the blocking procedure name; for example, the procedure MPI_IBCAST performs a nonblocking broadcast.
A nonblocking call initiates a collective operation and may return control to the user program before the operation completes. The nonblocking call creates and returns a communication request, which is used to wait for completion of the operation. Nonblocking collective operations can be used in much the same way as nonblocking point-to-point communication, although there are some differences between the two.
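The following C fragment is a minimal sketch (the function name nonblocking_bcast is only illustrative) of a nonblocking broadcast: the operation is started with MPI_IBCAST, other work can be overlapped, and completion is waited for with MPI_WAIT.

#include "mpi.h"

int nonblocking_bcast(int *data, int n, int root)
{
    MPI_Request request;

    /* Start the broadcast; control returns before the operation completes */
    MPI_Ibcast(data, n, MPI_INT, root, MPI_COMM_WORLD, &request);

    /* ... computation that does not touch data can be overlapped here ... */

    /* Wait until the broadcast has completed on this process */
    MPI_Wait(&request, MPI_STATUS_IGNORE);
    return MPI_SUCCESS;
}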
NEC MPI supports offloading MPI collective operations to the InfiniBand network by using the SHARP feature provided by the HPC-X software package of NVIDIA Corporation. This offloading feature can improve the performance and scalability of MPI collective communication procedures when MPI_COMM_WORLD, or a communicator composed of the same MPI process group as MPI_COMM_WORLD, is used.
2.5.3   One-Sided Communication
One-sided communication is a set of operations in which explicit cooperation of the other process is not required for data transfer; that is, an MPI process alone can transfer data to or from another process.
In order to perform one-sided communication, an MPI process has to register the memory region that it allows other processes to access. This region is called a window. The procedures MPI_WIN_CREATE, MPI_WIN_ALLOCATE, MPI_WIN_ALLOCATE_SHARED, and MPI_WIN_CREATE_DYNAMIC are used to create a window. If the procedure MPI_WIN_CREATE_DYNAMIC is used, a separate call to the procedure MPI_WIN_ATTACH is required to attach a memory region to the window. The procedure MPI_WIN_FREE frees the window. These procedures are collective operations.
One-Sided Communication Procedures
The following procedures perform one-sided communication:
The procedure MPI_GET reads data from a window on a target process. The procedure MPI_PUT writes data to a window on a target process. The procedure MPI_ACCUMULATE is similar to the procedure MPI_PUT, but performs a specified operation instead of a simple update. The procedure MPI_GET_ACCUMULATE also performs the specified operation and updates the target window, but the contents of the target window before the update are returned to the origin process. The procedure MPI_FETCH_AND_OP performs the same operation as the procedure MPI_GET_ACCUMULATE, but only a single element can be specified. The procedure MPI_COMPARE_AND_SWAP replaces the contents of the target window with the origin buffer if the compare buffer is equal to the target buffer; the value of the target buffer before the operation is always returned to the origin process.
The procedures MPI_RGET, MPI_RPUT, MPI_RACCUMULATE, and MPI_RGET_ACCUMULATE perform the same operations as the procedures MPI_GET, MPI_PUT, MPI_ACCUMULATE, and MPI_GET_ACCUMULATE, respectively, and additionally return MPI communication requests. A completion call, such as the procedure MPI_WAIT or MPI_TEST, can be used to complete the operation. A synchronization call for one-sided communication, described below, also completes the communication initiated by these procedures, but the completion call is still required.
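The following C fragment is a minimal sketch, assuming an already created window win, of a request-based get inside a passive-target epoch (lock synchronization, described below); MPI_WAIT completes the operation at the origin.

#include "mpi.h"

int request_based_get(int *buf, int n, int target, MPI_Win win)
{
    MPI_Request request;

    /* Start a passive-target access epoch on all processes of the window */
    MPI_Win_lock_all(MPI_MODE_NOCHECK, win);

    /* Request-based get: read n integers from displacement 0 of the target window */
    MPI_Rget(buf, n, MPI_INT, target, 0, n, MPI_INT, win, &request);

    /* The completion call makes the data available in buf */
    MPI_Wait(&request, MPI_STATUS_IGNORE);

    MPI_Win_unlock_all(win);
    return MPI_SUCCESS;
}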
Three kinds of synchronization mechanisms are provided for one-sided communication: fence synchronization with the procedure MPI_WIN_FENCE, general active-target synchronization with the procedures MPI_WIN_POST, MPI_WIN_START, MPI_WIN_COMPLETE, and MPI_WIN_WAIT, and lock synchronization with the procedures MPI_WIN_LOCK, MPI_WIN_LOCK_ALL, MPI_WIN_UNLOCK, and MPI_WIN_UNLOCK_ALL.
The procedures MPI_WIN_POST, MPI_WIN_START, MPI_WIN_FENCE, MPI_WIN_LOCK and MPI_WIN_LOCK_ALL can take assertion arguments, which tell NEC MPI the context in which they are called. Valid assertions are as follows:
MPI_WIN_POST

assertion | meaning | NEC MPI
---|---|---
MPI_MODE_NOCHECK | The corresponding calls to the procedure MPI_WIN_START have not yet occurred. MPI_MODE_NOCHECK has to be specified for both procedures MPI_WIN_POST and MPI_WIN_START. | used
MPI_MODE_NOSTORE | After the previous synchronization, the local window is not updated by local stores. | ignored
MPI_MODE_NOPUT | After the post call, the local window is not updated by the procedures MPI_PUT and MPI_ACCUMULATE. | used

MPI_WIN_START

assertion | meaning | NEC MPI
---|---|---
MPI_MODE_NOCHECK | The corresponding calls to the procedure MPI_WIN_POST have already completed. MPI_MODE_NOCHECK has to be specified for both procedures MPI_WIN_POST and MPI_WIN_START. | used

MPI_WIN_FENCE

assertion | meaning | NEC MPI
---|---|---
MPI_MODE_NOSTORE | After the previous synchronization, the local window is not updated by local stores. | ignored
MPI_MODE_NOPUT | The local window will not be updated by the procedures MPI_PUT and MPI_ACCUMULATE until the next synchronization. | used
MPI_MODE_NOPRECEDE | The synchronization does not complete any locally issued communication operations. MPI_MODE_NOPRECEDE has to be specified by all processes. | used
MPI_MODE_NOSUCCEED | After the synchronization, no communication operations will be issued. MPI_MODE_NOSUCCEED has to be specified by all processes. | used
MPI_MODE_NEC_GETPUTALIGNED | Asserts that all one-sided communication operations were performed via direct memory copy (see the description). | used

MPI_WIN_LOCK, MPI_WIN_LOCK_ALL

assertion | meaning | NEC MPI
---|---|---
MPI_MODE_NOCHECK | No other process holds or will attempt to acquire a conflicting lock while the calling process holds the lock. | used
Example of One-Sided Communication
The following Fortran and C program fragments show examples of one-sided communication. In these examples, data on the root process (rank 0 in the Fortran version; the argument root in the C version) is transferred to the other processes in the communicator. The procedure MPI_GET is used for the data transfer, and the procedure MPI_WIN_FENCE for the synchronization.
Example:
SUBROUTINE PROPAGATE(BUFF, N)
    use mpi
    INTEGER ROOT, IERROR, RANK, DISP
! Define the type of a window object, WIN
    INTEGER WIN
    INTEGER BUFF(N)
    INTEGER(KIND=MPI_ADDRESS_KIND) WINSIZE, OFFSET
! size of an INTEGER element in bytes and displacement unit of the window
    DISP = 4
    WINSIZE = 4
    OFFSET = 0
    ROOT = 0
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, RANK, IERROR)
! Create memory window for process root and all other processes in the communicator
    IF (RANK == ROOT) THEN
        CALL MPI_WIN_CREATE(BUFF, WINSIZE*N, DISP, MPI_INFO_NULL, MPI_COMM_WORLD, &
                            WIN, IERROR)
    ELSE
        CALL MPI_WIN_CREATE(MPI_BOTTOM, WINSIZE*0, DISP, MPI_INFO_NULL, MPI_COMM_WORLD, &
                            WIN, IERROR)
    ENDIF
! All other processes get data from process root
    CALL MPI_WIN_FENCE(MPI_MODE_NOPRECEDE, WIN, IERROR)
    IF (RANK /= ROOT) THEN
        CALL MPI_GET(BUFF, N, MPI_INTEGER, ROOT, OFFSET, N, MPI_INTEGER, WIN, IERROR)
    ENDIF
! Complete the one-sided communication operation
    CALL MPI_WIN_FENCE(MPI_MODE_NOSUCCEED, WIN, IERROR)
! Free the window
    CALL MPI_WIN_FREE(WIN, IERROR)
    RETURN
END SUBROUTINE PROPAGATE
int propagate(int *buf, int n, int root, MPI_Comm comm)
{
    /* Define the type of a window object, win */
    MPI_Win win;
    int unit = sizeof(int);   /* displacement unit of the window */
    int rank;

    MPI_Comm_rank(comm, &rank);
    /* Create memory window for process root and all other processes in the communicator */
    if (rank == root)
        MPI_Win_create(buf, unit * n, unit, MPI_INFO_NULL, comm, &win);
    else
        MPI_Win_create(NULL, 0, unit, MPI_INFO_NULL, comm, &win);
    /* All other processes get data from process root */
    MPI_Win_fence(MPI_MODE_NOPRECEDE, win);
    if (rank != root)
        MPI_Get(buf, n, MPI_INT, root, 0, n, MPI_INT, win);
    /* Complete the one-sided communication operation */
    MPI_Win_fence(MPI_MODE_NOSUCCEED, win);
    /* Free the window */
    MPI_Win_free(&win);
    return MPI_SUCCESS;
}
2.6   Parallel I/O
2.6.1   Overview
MPI defines a parallel I/O facility, called MPI-IO, that allows MPI processes to access the same file concurrently. MPI-IO enables highly optimized I/O operations by, for example, expressing the file access pattern of each MPI process with MPI datatypes and by providing collective I/O operations.
In addition, even if the data accessed by an MPI process is actually stored non-contiguously in a file, the data can be viewed as a contiguous stream during I/O operations, without any awareness of the actual data locations, just as in MPI communication.
2.6.2   View
When file I/O operations are performed with MPI-IO, each MPI process has to declare in advance which segments of a file it will access. The set of segments to be accessed is called a view. A view is expressed by an MPI datatype, either basic or derived, and can therefore express various access patterns easily.
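The following C fragment is a minimal sketch of defining a view, assuming a file already opened with handle fh (the function name and parameters are only illustrative): each process sees one block of blocklen integers out of every stripe of nprocs blocks.

#include "mpi.h"

int set_strided_view(MPI_File fh, int rank, int nprocs, int blocklen)
{
    MPI_Datatype block, filetype;
    MPI_Offset disp;

    /* One block of blocklen integers ... */
    MPI_Type_contiguous(blocklen, MPI_INT, &block);
    /* ... resized to the extent of a whole stripe (nprocs blocks), so that
       the view tiles the file with one block per process per stripe */
    MPI_Type_create_resized(block, 0,
        (MPI_Aint)nprocs * blocklen * sizeof(int), &filetype);
    MPI_Type_commit(&filetype);

    /* Each process starts at its own block within the first stripe */
    disp = (MPI_Offset)rank * blocklen * (MPI_Offset)sizeof(int);
    MPI_File_set_view(fh, disp, MPI_INT, filetype, "native", MPI_INFO_NULL);

    MPI_Type_free(&block);
    MPI_Type_free(&filetype);
    return MPI_SUCCESS;
}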
2.6.3   Principal Procedures
In order to use the MPI-IO facility, the following operations must be performed: open a file with the procedure MPI_FILE_OPEN, set a view with the procedure MPI_FILE_SET_VIEW if necessary, access data with the data access procedures described below, and finally close the file with the procedure MPI_FILE_CLOSE.
The MPI-IO provides various data access procedures, and an independent procedure exists for every combination of the following aspects.
Positioning | Synchronism | Coordination (Non-Collective) | Coordination (Collective)
---|---|---|---
Explicit Offset | Blocking | MPI_FILE_READ_AT, MPI_FILE_WRITE_AT | MPI_FILE_READ_AT_ALL, MPI_FILE_WRITE_AT_ALL
Explicit Offset | Nonblocking | MPI_FILE_IREAD_AT, MPI_FILE_IWRITE_AT | MPI_FILE_IREAD_AT_ALL, MPI_FILE_IWRITE_AT_ALL
Explicit Offset | Split Collective | N/A | MPI_FILE_READ_AT_ALL_BEGIN, MPI_FILE_READ_AT_ALL_END, MPI_FILE_WRITE_AT_ALL_BEGIN, MPI_FILE_WRITE_AT_ALL_END
Individual File Pointers | Blocking | MPI_FILE_READ, MPI_FILE_WRITE | MPI_FILE_READ_ALL, MPI_FILE_WRITE_ALL
Individual File Pointers | Nonblocking | MPI_FILE_IREAD, MPI_FILE_IWRITE | MPI_FILE_IREAD_ALL, MPI_FILE_IWRITE_ALL
Individual File Pointers | Split Collective | N/A | MPI_FILE_READ_ALL_BEGIN, MPI_FILE_READ_ALL_END, MPI_FILE_WRITE_ALL_BEGIN, MPI_FILE_WRITE_ALL_END
Shared File Pointer | Blocking | MPI_FILE_READ_SHARED, MPI_FILE_WRITE_SHARED | MPI_FILE_READ_ORDERED, MPI_FILE_WRITE_ORDERED
Shared File Pointer | Nonblocking | MPI_FILE_IREAD_SHARED, MPI_FILE_IWRITE_SHARED | N/A
Shared File Pointer | Split Collective | N/A | MPI_FILE_READ_ORDERED_BEGIN, MPI_FILE_READ_ORDERED_END, MPI_FILE_WRITE_ORDERED_BEGIN, MPI_FILE_WRITE_ORDERED_END
2.6.4   File Name Prefixes in the Procedure MPI_FILE_OPEN
NEC MPI determines file system types using standard C
library functions by default. Application programs can explicitly specify
a file system type using a prefix in the file name given to the procedure
MPI_FILE_OPEN.
NEC MPI currently accepts the
prefixes "nfs:" and "stfs:", which indicate
NFS and ScaTeFS, respectively.
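For example, the following C fragment (with a hypothetical path) explicitly requests that the file be treated as residing on NFS, regardless of the automatic detection:

MPI_File fh;
/* The "nfs:" prefix overrides the automatic file system type detection;
   the path itself is only an illustration */
MPI_File_open(MPI_COMM_WORLD, "nfs:/home/user/output.dat",
              MPI_MODE_RDWR | MPI_MODE_CREATE, MPI_INFO_NULL, &fh);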
2.6.5   Example Program
The following Fortran and C programs show examples using MPI-IO. They use the shared-file-pointer procedure MPI_FILE_WRITE_SHARED to append data to the file "output.dat", and the rank-0 process then reads the data back with an individual file pointer.
Example:
PROGRAM MAIN
    use mpi
    IMPLICIT NONE
    INTEGER :: IERR
    INTEGER :: SBUF(4)
    INTEGER :: RBUF(16)
    INTEGER :: N
    INTEGER :: MYRANK
    INTEGER :: NPROCS
! Define the type of a file handle, FH
    INTEGER :: FH
    INTEGER(KIND=MPI_OFFSET_KIND) :: DISP = 0
    INTEGER(KIND=MPI_OFFSET_KIND) :: OFFSET = 0
    CALL MPI_INIT(IERR)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, MYRANK, IERR)
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD, NPROCS, IERR)
    SBUF = MYRANK
    RBUF = -1
    N = 4
    print *, MYRANK, ": SBUF =>", SBUF
! Open a common file with a name "output.dat"
    CALL MPI_FILE_OPEN(MPI_COMM_WORLD, "output.dat", MPI_MODE_RDWR+MPI_MODE_CREATE, &
                       MPI_INFO_NULL, FH, IERR)
! Set a process's file view
    CALL MPI_FILE_SET_VIEW(FH, DISP, MPI_INTEGER, MPI_INTEGER, "native", MPI_INFO_NULL, IERR)
! Write data starting from the current location of the shared file pointer
    CALL MPI_FILE_WRITE_SHARED(FH, SBUF, N, MPI_INTEGER, MPI_STATUS_IGNORE, IERR)
! Cause all previous writes to be transferred to the storage device
    CALL MPI_FILE_SYNC(FH, IERR)
    IF (MYRANK == 0) THEN
! Move an individual file pointer to the location in the file from which
! a process needs to read data
        CALL MPI_FILE_SEEK(FH, OFFSET, MPI_SEEK_SET, IERR)
! Read data from the current location of the individual file pointer in the file
        CALL MPI_FILE_READ(FH, RBUF, N*min(NPROCS,4), MPI_INTEGER, MPI_STATUS_IGNORE, IERR)
        print *, MYRANK, ": RBUF =>", RBUF
    ENDIF
! Close the common file
    CALL MPI_FILE_CLOSE(FH, IERR)
    CALL MPI_FINALIZE(IERR)
END PROGRAM
#include "mpi.h" #include <stdio.h> int main(int argc, char *argv[]) { int sbuf[4]; int rbuf[16]; int n; int myrank; int nprocess; int disp = 0; int offset = 0; int tmp; /* Define the type of a file handle, fh */ MPI_File fh; MPI_Init(&argc, &argv); MPI_Comm_rank(MPI_COMM_WORLD, &myrank); MPI_Comm_size(MPI_COMM_WORLD, &nprocess); for(int i=0; i<4; i++) sbuf[i] = myrank; for(int i=0; i<16; i++) rbuf[i] = -1; n = 4; printf("%d:sbuf=>", myrank); for(int i=0; i<4; i++) printf("%d ", sbuf[i]); printf("\n"); /* Open a common file with a name "output.dat" */ MPI_File_open(MPI_COMM_WORLD, "output.dat", MPI_MODE_RDWR|MPI_MODE_CREATE, MPI_INFO_NULL, &fh); /* Set a process's file view */ MPI_File_set_view(fh, disp, MPI_INT, MPI_INT, "native", MPI_INFO_NULL); /* Write data starting from the current location of the shared file pointer */ MPI_File_write_shared(fh, sbuf, n, MPI_INT, MPI_STATUS_IGNORE); /* Cause all previous writes to be transferred to the storage device */ MPI_File_sync(fh); if (myrank==0) { printf("%d READ ALL FROM FILE\n", myrank); /* Move an individual file pointer to the location in the file from which a process needs to read data */ MPI_File_seek(fh, offset, MPI_SEEK_SET); if (nprocess < 4) { tmp = nprocess; } else { tmp = 4; } /* Read data from the current location of the individual file pointer in the file */ MPI_File_read(fh, rbuf, n*tmp, MPI_INT, MPI_STATUS_IGNORE); printf("%d: rbuf=>", myrank); for(int i=0; i<16; i++) printf("%d ", rbuf[i]); printf("\n"); } /* Close the common file */ MPI_File_close(&fh); MPI_Finalize(); return 0; } |
2.6.6   Acceleration of Parallel I/O
Parallel I/O performance can be improved by using the ScaTeFS VE direct IB library or the accelerated I/O function when executing MPI programs on VE. Usage examples are as follows.
$ export VE_LD_PRELOAD=libscatefsib.so.1
$ mpirun -np 1 ./a.out
$ export VE_ACC_IO=1
$ mpirun -np 1 ./a.out
2.7   Thread Execution Environment
MPI defines the following levels of thread support: MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED, MPI_THREAD_SERIALIZED, and MPI_THREAD_MULTIPLE.
If processes are multi-threaded, they use the procedure MPI_INIT_THREAD instead of the procedure MPI_INIT for the initialization of MPI. In addition to the arguments of the procedure MPI_INIT, the procedure MPI_INIT_THREAD takes the requested thread support level as an input argument and returns the thread support level actually provided.
The thread support level provided by NEC MPI is MPI_THREAD_SERIALIZED. Therefore, any thread in an MPI process can make calls to MPI procedures, but users must ensure that two or more threads never call MPI procedures at the same time. In addition, if synchronization among threads is performed with, for example, a lock file or a shared variable instead of the standard synchronization methods provided by shared-memory parallel processing models such as OpenMP or POSIX threads, cache coherency among threads may not be guaranteed.
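The following C fragment is a minimal sketch that requests MPI_THREAD_SERIALIZED and checks the level actually provided.

#include "mpi.h"

int main(int argc, char **argv)
{
    int provided;

    /* Request MPI_THREAD_SERIALIZED and obtain the level actually provided */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);

    if (provided < MPI_THREAD_SERIALIZED) {
        /* the requested thread support level is not available */
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* ... multi-threaded work; only one thread may call MPI at a time ... */

    MPI_Finalize();
    return 0;
}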
2.8   Error Handler
MPI defines an error handling mechanism that invokes a specified procedure (error handler) when an MPI procedure detects an error such as "invalid arguments".
An error handler can be attached to each communicator, to each file handle used for MPI-IO, and to each window used for one-sided communication; the error handler is called when an error occurs in the corresponding MPI operation.
In order to use error handlers, they first have to be registered. The procedure MPI_XXX_CREATE_ERRHANDLER registers an error handler, where XXX is COMM for a communicator, WIN for a window, and FILE for a file handle. The procedure MPI_XXX_SET_ERRHANDLER then attaches the registered error handler, and the procedure MPI_XXX_GET_ERRHANDLER retrieves the error handler currently attached.
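The following C fragment is a minimal sketch (the handler and function names are only illustrative) that registers a user-defined error handler for communicators and attaches it to MPI_COMM_WORLD.

#include <stdio.h>
#include "mpi.h"

/* User-defined error handler: print the error class and abort */
static void my_handler(MPI_Comm *comm, int *errcode, ...)
{
    int errclass;
    MPI_Error_class(*errcode, &errclass);
    fprintf(stderr, "MPI error detected, class %d\n", errclass);
    MPI_Abort(*comm, *errcode);
}

int install_handler(void)
{
    MPI_Errhandler errh;

    /* Register the error handler and attach it to MPI_COMM_WORLD */
    MPI_Comm_create_errhandler(my_handler, &errh);
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, errh);
    MPI_Errhandler_free(&errh);
    return MPI_SUCCESS;
}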
2.9   Attribute Caching
MPI provides a caching facility that allows MPI processes to attach arbitrary attributes to communicators, windows, and datatypes.
Before attributes can be used, attribute keys have to be created and registered with the procedure MPI_XXX_CREATE_KEYVAL, where XXX is COMM for a communicator, WIN for a window, and TYPE for a datatype. The procedure MPI_XXX_SET_ATTR attaches information to an object, the procedure MPI_XXX_GET_ATTR retrieves the cached information, and the procedure MPI_XXX_DELETE_ATTR removes the cached information.
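The following C fragment is a minimal sketch (the function name cache_attribute is only illustrative) of attaching and retrieving an attribute on a communicator.

#include "mpi.h"

int cache_attribute(MPI_Comm comm)
{
    int keyval, flag;
    static int my_value = 42;   /* the cached value must outlive the call */
    int *retrieved;

    /* Create an attribute key (no copy/delete callbacks) */
    MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, MPI_COMM_NULL_DELETE_FN,
                           &keyval, NULL);

    /* Attach the attribute to the communicator */
    MPI_Comm_set_attr(comm, keyval, &my_value);

    /* Retrieve the cached attribute; flag indicates whether it was set */
    MPI_Comm_get_attr(comm, keyval, &retrieved, &flag);

    /* Remove the attribute and release the key */
    MPI_Comm_delete_attr(comm, keyval);
    MPI_Comm_free_keyval(&keyval);
    return MPI_SUCCESS;
}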
2.10   Fortran Support
NEC MPI provides the following Fortran support methods.
2.11   Info Object
The info object, MPI_INFO, is used, for example, to give optimization information to an MPI procedure. An MPI_INFO object consists of (key, value) pairs of character strings, where the key identifies the information and the value is the corresponding setting. An MPI_INFO object is an opaque object and is accessed via a handle.
The procedure MPI_INFO_CREATE creates a new, empty MPI_INFO object and returns a handle to it. The procedure MPI_INFO_SET adds a (key, value) pair to the object, the procedure MPI_INFO_GET retrieves the value associated with a key, and the procedure MPI_INFO_DELETE removes a (key, value) pair from the object. Other procedures handling MPI_INFO are as follows: the procedure MPI_INFO_GET_VALUELEN returns the length of a value string, the procedure MPI_INFO_GET_NKEYS returns the number of currently defined keys, the procedure MPI_INFO_GET_NTHKEY returns the n-th key, the procedure MPI_INFO_DUP duplicates an object, and the procedure MPI_INFO_FREE frees an object and sets its handle to MPI_INFO_NULL.
One example of MPI_INFO object use is to provide optimization information for MPI-IO operations. In MPI-IO, an MPI_INFO object may be used to inform the MPI-IO system of the file access pattern, the characteristics of the underlying file system or storage device, or the internal buffer size suitable for an application, a storage device, or a file system.
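The following C fragment is a minimal sketch of creating an info object and passing an MPI-IO hint to MPI_FILE_OPEN; cb_buffer_size is a hint name reserved by the MPI standard (collective buffering buffer size), and whether a hint is honored is up to the implementation.

#include "mpi.h"

int open_with_hint(const char *filename, MPI_File *fh)
{
    MPI_Info info;

    /* Create an empty info object and set a (key, value) pair */
    MPI_Info_create(&info);
    MPI_Info_set(info, "cb_buffer_size", "16777216");

    /* Pass the hints to the file open call */
    MPI_File_open(MPI_COMM_WORLD, filename,
                  MPI_MODE_RDWR | MPI_MODE_CREATE, info, fh);

    /* The info object may be freed once it has been passed to MPI */
    MPI_Info_free(&info);
    return MPI_SUCCESS;
}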
The following key values of the MPI_INFO object are currently interpreted by NEC MPI:
Note: ip_address and ip_port are used in the same manner as the bind(2-B) function.
2.12   Language Interoperability
In the case that an MPI program is implemented in both Fortran and C/C++
(language mix), a handle for an MPI object
(see Table 2-3) must
be translated if the MPI object is shared by Fortran routines and
C/C++ routines. C binding routines are provided for translating handles
of MPI objects; there are no corresponding Fortran
bindings.
For example, the following procedure translates a handle for an MPI
communicator used in C language into a handle that can be used in
Fortran language.
MPI_Fint MPI_Comm_c2f(MPI_Comm c_comm)
where c_comm refers to a communicator handle used in C, and
the return value of the procedure MPI_COMM_C2F can be used as a handle
in Fortran (MPI_Fint refers to a datatype
corresponding to the Fortran INTEGER type).
The following procedure translates a handle for Fortran language to
that for C/C++.
MPI_Comm MPI_Comm_f2c(MPI_Fint f_comm)
When an MPI communication status (MPI_Status) is translated, the following
procedures are used; they take pointer arguments for the translation
source and the result.
int MPI_Status_c2f(MPI_Status *c_status, MPI_Fint *f_status)
int MPI_Status_f2c(MPI_Fint *f_status, MPI_Status *c_status)
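The following C fragment is a minimal sketch of passing a communicator from C to a Fortran routine (the routine name fortran_sub_ is only illustrative) by converting its handle.

#include "mpi.h"

/* Hypothetical Fortran routine taking a Fortran communicator handle */
extern void fortran_sub_(MPI_Fint *comm);

int call_fortran(MPI_Comm comm)
{
    /* Convert the C handle into a Fortran handle before the call */
    MPI_Fint fcomm = MPI_Comm_c2f(comm);
    fortran_sub_(&fcomm);

    /* The reverse conversion recovers a C handle from a Fortran handle */
    MPI_Comm back = MPI_Comm_f2c(fcomm);
    (void)back;
    return MPI_SUCCESS;
}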
Table 2-3 lists the MPI objects and the corresponding procedures for translating handles.
MPI Object | C Handle Type | C to Fortran | Fortran to C
---|---|---|---
Communicator | MPI_Comm | MPI_COMM_C2F | MPI_COMM_F2C
Datatype | MPI_Datatype | MPI_TYPE_C2F | MPI_TYPE_F2C
Group | MPI_Group | MPI_GROUP_C2F | MPI_GROUP_F2C
Request | MPI_Request | MPI_REQUEST_C2F | MPI_REQUEST_F2C
File | MPI_File | MPI_FILE_C2F | MPI_FILE_F2C
Window | MPI_Win | MPI_WIN_C2F | MPI_WIN_F2C
Reduction Operation | MPI_Op | MPI_OP_C2F | MPI_OP_F2C
Info | MPI_Info | MPI_INFO_C2F | MPI_INFO_F2C
Status | MPI_Status | MPI_STATUS_C2F | MPI_STATUS_F2C
Message | MPI_Message | MPI_MESSAGE_C2F | MPI_MESSAGE_F2C
2.13   Nonblocking MPI Procedures in Fortran Programs
When an array argument such as an array expression or a non-contiguous array section
is passed as a communication buffer or I/O buffer to a nonblocking MPI procedure
in a Fortran program, the Fortran compiler may allocate a work array, copy the
argument into the work array, and pass the work array to the MPI procedure in order
to guarantee the contiguity of the array elements. Because such a compiler-generated
work array may be deallocated as soon as the procedure returns, that is, before the
nonblocking operation completes, users must instead explicitly allocate a temporary
array and use it as the actual argument, as shown in the following examples.
real :: A(1000)
integer :: request
call MPI_ISEND(A+10.0, ..., request, ...)   ! array expression
   :                                        ! (not permitted here)
   :
call MPI_WAIT(request, ...)

should be rewritten as

real :: A(1000)
real, allocatable :: tmp(:)
integer :: request
allocate(tmp(1000))                         ! allocate user work array
tmp = A + 10.0
call MPI_ISEND(tmp, ..., request, ...)      ! MPI call with the user work array
   :
   :
call MPI_WAIT(request, ...)
deallocate(tmp)                             ! frees the user work array
Figure 2-1 An Example of User Work Array (for Send Operation)
real :: A(1000)
integer :: request
call MPI_IRECV(A(1:1000:2), ..., request, ...)   ! array section
   :                                             ! (not permitted here)
   :
call MPI_WAIT(request, ...)

should be rewritten as

real :: A(1000)
real, allocatable :: tmp(:)
integer :: request
allocate(tmp(500))                          ! allocate user work array
call MPI_IRECV(tmp, ..., request, ...)      ! MPI call with the user work array
   :
   :
call MPI_WAIT(request, ...)
A(1:1000:2) = tmp                           ! copies the user work array
                                            ! to a target array
deallocate(tmp)                             ! frees the user work array
Figure 2-2 An Example of User Work Array (for Receive Operation)