CECS Home | ANU Home | Search ANU
The Australian National University
ANU College of Engineering and Computer Science (CECS)
Department of Computer Science
Printer Friendly Version of this Document
High Performance Scientific Computing COMP4300
COMP4300 Assignment 1

COMP4300 Assignment 1

Version: 10:00 Tuesday 21 April 2009 (submission and deadline revised)

This assignment is worth 20% of the total course mark.

Deadline: 5pm Sunday 26th April 2009

Late penalty

  • < 6 hours 1 mark from 20
  • < 12 hours 2 marks from 20
  • < 24 hours 4 marks from 20
  • < 48 hours 8 marks from 20
  • < 72 hours 12 marks from 20
  • > 72 hours 20 marks from 20

Submission

Will be a tar file via streams. Execute the instruction
            submit comp4300 Ass1 assign1.tar
from the student system.

Objective

Your task is to develop an MPI parallel implementation of red/blue flow simulation program. The basic idea is taken from Lin and Snyder (see Chapter 4, Exercise 10 on page 125). You have a grid of size Nx by Ny. Each grid point has a colour of red, blue or white. Empty grid points are white. There are NRED and NBLUE grid points that are randomly selected to be red and blue respectively.

The computation proceeds in phases. In each phase a red grid point moves to the right if the adjacent grid point is white, while the blue grid points move down if their adjacent grid point is white. After a coloured grid point has moved, the vacated grid point becomes white. Red grid points are updated before blue grid points, thus a blue grid point can move into a grid point vacated by a red moving in the same phase of the computation. The grid is periodic, so a red grid point exiting the grid to the right re-enters the grid on the left. Similarly blue grid points exiting at the bottom re-enter at the top of the grid. The red and blue updates within a single phase are shown below.

For any grid point a neighbouring grid point is defined as one for which the X and Y coordinates differ by less than some defined value Delta (shown below for delta=1).

After each update phase the program computes a histogram of the number of non-white neighbours that each red and blue grid point has. From the histogram the average number of neighbours, the standard deviation, and the maximum number of neighbours is evaluated and printed out for that step of the computation.

The computation continues for a maximum number of steps, OR until 60 seconds elapsed time has been exhausted.

A tar file containing a sequential version of the code is available here

Requirements

You are required to parallelise the code using MPI ensuring that the results produced are identical to the sequential code. (There is a random number generator used initially, you need to ensure the starting configuration is the same as for the provided code regardless of the number of processes used to run your code.)

Your challenge is to make the code run as fast as possible on the Saratoga cluster, within a 60 seconds elapsed time limit. Your code will be assessed as follows:

  • Basic test: eg Nx = Ny, N_Red=N_Blue=[0.05Nx,0.2Nx], Delta=[1,2]
  • Special cases: eg Nx [>>,<<] Ny, [N_Red,N_Blue]=0-Nx*Ny, Delta=[0,Nx,Ny,2Nx,2Ny]
  • Large case: Nx = (0.1-10.0) Ny, [N_Red,N_Blue]=(0.01-0.2)Nx*Ny, Delta=1-5
  • Challenge case: Nx = [0.1-10.0] Ny, N_Red=N_Blue<=1,000,000, Delta=1-5. Most iterations performed within 60seconds.
I make no claim that the provided sequential code is in anyway optimal. So as well as parallelising the code using MPI, you should also consider possible alternative implementations. You will be free to do what you like, provided that the output is identical format to that given by the sequential code.

What you need to submit

You should submit a tar file called "assign1.tar", which should untar to contain the following
  • A README file detailing the content of the tar file.
  • The source code for your program, complete with comments.
  • A makefile that will build the code by typing "make" on Saratoga and produce a binary called "flow" that also runs on the Saratoga cluster.
  • A plain text file containing a write up of what you did, analysis of the performance you observe, and details of what you believe to be the largest calculation you can run. Name this file "writeup.txt".

Marking Scheme

Below is a rough indication of how marks will be allocated:
  • Content of readme file, that the code compiles, builds and runs without obvious errors (2-4 marks)
  • What the code does as problem characteristics and number of processes used is changed. Your mark here will be derived from running the code and checking the results are correct, looking at the source, and reading the content of the writeup. (5-7 marks).
  • Observed Performance obtained and maximum system size possible. (5-7 marks)
  • Content of writeup and analysis. (4-6 marks)