Hands-On Session Advanced Messaging #2:
MPI Collective Communications

Objective: To gain experience in the use of MPI collective communications.
The code for this exercise can be found here.
Instructions on how to log into the remote machine and how to download the source code to your working directory (using wget) can be found here.

  1. The program heat.c from the previous session uses only point-to-point communications. There are, however, places in the code where these are used to implement broadcasts and an all-reduction. Identify these places and replace the corresponding code sections with the appropriate MPI collective communication calls; a sketch of the kind of replacement is given at the end of this item. Test your code to make sure it still works, e.g. by
    mpirun -np 8 ./heat < heat.input
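
    As a rough guide, the replacements will look something like the sketch below. The variable names (params, nparams, local_err, global_err), the counts and the reduction operator are placeholders, not the names used in heat.c; use the names, counts and MPI_Op that heat.c already employs in its hand-rolled versions.

    /* A hand-rolled broadcast (rank 0 looping over MPI_Send, all other ranks
       calling MPI_Recv) becomes a single collective call made by every rank: */
    MPI_Bcast(params, nparams, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Sending each local value to rank 0, combining them there and sending the
       result back to every rank is exactly an all-reduction.  Use whichever
       MPI_Op (e.g. MPI_SUM or MPI_MAX) matches how heat.c combines the values: */
    MPI_Allreduce(&local_err, &global_err, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);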

  2. Re-run the IPM profiling exercise of session AM1. Use the batch file for more accurate results. Now which MPI function call is dominant?

  3. The program mpiAllGathRedScatt n performs an all-gather and a reduce-scatter on buffers of n words. It has three versions of these algorithms:
    0. using MPI_Allgather() and MPI_Reduce_scatter()
    1. using MPI_Gather(...); MPI_Bcast(...) and MPI_Reduce(...); MPI_Scatter(...)
      You may choose process 0 as the root for the gather / reduction. For the latter, process 0 can use gbuf as the source and destination, provided it specifies MPI_IN_PLACE for the sbuf parameter.
    2. using (naive) point-to-point messaging versions of the above.
    Compile and run the program for various n, e.g. mpirun -np 8 ./mpiAllGathRedScatt 1000. You will notice that it prints out only the version 0 timings, as the other versions are yet to be implemented (your task!).

    Implement the version 1 algorithms and test them. When satisfied, use the batch file. Which version was better, and for which values of n and numbers of processes?

    Time permitting, add the code for version 2 and repeat the measurements. Hint: avoid having a process send to itself, as this may cause subtle deadlock issues; memcpy() can be used instead. A sketch of possible implementations of both versions is given after the synopsis below.

    The synopsis of the MPI functions for versions 1 and 2 is:

    int MPI_Bcast(void *buf, int count, MPI_Datatype dt, 
                  int root, MPI_Comm comm);
    int MPI_Reduce(const void *sbuf, void *rbuf, int count, 
                   MPI_Datatype dt, MPI_Op op, int root, MPI_Comm comm);
    int MPI_Scatter(const void *sbuf, int scount, MPI_Datatype sdt,
                    void *rbuf, int rcount, MPI_Datatype rdt,
                    int root, MPI_Comm comm);
    int MPI_Gather(const void *sbuf, int scount, MPI_Datatype sdt,
                   void *rbuf, int rcount, MPI_Datatype rdt,
                   int root, MPI_Comm comm);
    int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm);
    int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
                 int source, int tag, MPI_Comm comm,
                 MPI_Status *status);
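
    To make the task concrete, the sketch below shows one possible shape for the version 1 routines and for a version 2 all-gather. The function names (allgather_v1, reducescatter_v1, allgather_v2), the tag, the use of MPI_DOUBLE and MPI_SUM, and the assumption that each rank contributes n words to the all-gather (and p*n words to the reduce-scatter, receiving n back) are illustrative guesses only; adapt them to the buffers, datatype, reduction operator and counts that mpiAllGathRedScatt actually uses.

    #include <string.h>   /* memcpy() */
    #include <mpi.h>

    #define TAG 0

    /* Version 1 all-gather: gather everything to rank 0, then broadcast.
       sbuf holds this rank's n doubles; gbuf must hold p*n doubles. */
    static void allgather_v1(const double *sbuf, double *gbuf, int n, MPI_Comm comm)
    {
        int p;
        MPI_Comm_size(comm, &p);
        MPI_Gather(sbuf, n, MPI_DOUBLE, gbuf, n, MPI_DOUBLE, 0, comm);
        MPI_Bcast(gbuf, p * n, MPI_DOUBLE, 0, comm);
    }

    /* Version 1 reduce-scatter: reduce to rank 0 (in place on the root, as the
       hint above suggests), then scatter one n-word block back to each rank. */
    static void reducescatter_v1(double *gbuf, double *rbuf, int n, MPI_Comm comm)
    {
        int rank, p;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &p);
        if (rank == 0)
            MPI_Reduce(MPI_IN_PLACE, gbuf, p * n, MPI_DOUBLE, MPI_SUM, 0, comm);
        else
            MPI_Reduce(gbuf, NULL, p * n, MPI_DOUBLE, MPI_SUM, 0, comm);
        MPI_Scatter(gbuf, n, MPI_DOUBLE, rbuf, n, MPI_DOUBLE, 0, comm);
    }

    /* Version 2 all-gather: naive point-to-point exchange.  The local block is
       copied with memcpy() rather than sent to self, and within each pair the
       lower rank sends before receiving, so no cycle of blocking calls can form. */
    static void allgather_v2(const double *sbuf, double *gbuf, int n, MPI_Comm comm)
    {
        int rank, p, i;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &p);
        for (i = 0; i < p; i++) {
            if (i == rank) {
                memcpy(gbuf + rank * n, sbuf, n * sizeof(double));
            } else if (rank < i) {
                MPI_Send(sbuf, n, MPI_DOUBLE, i, TAG, comm);
                MPI_Recv(gbuf + i * n, n, MPI_DOUBLE, i, TAG, comm, MPI_STATUS_IGNORE);
            } else {
                MPI_Recv(gbuf + i * n, n, MPI_DOUBLE, i, TAG, comm, MPI_STATUS_IGNORE);
                MPI_Send(sbuf, n, MPI_DOUBLE, i, TAG, comm);
            }
        }
    }

    A naive point-to-point reduce-scatter can follow the same pairwise pattern: each rank sends block j of its input to rank j, receives the corresponding blocks from the other ranks, and sums them locally into its own block.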