CECS Home | ANU Home | Search ANU
The Australian National University
ANU College of Engineering and Computer Science
School of Computer Science
Printer Friendly Version of this Document

UniSAFE

Introduction to Computer Systems

Assignment 3 Specification - Discrete Fourier Transform (DFT)

The aim of this assignment is to give you some experience in evaluating the performance of a program for a particular architecture. This assignment also aims to give you some practice in scientific report writing.

The basic part of the assignment aims to be simple enough for most students to be able to complete in less then 5 hours.

The assignment will be marked out of 100. And is worth 10% of the final mark. A late penalty of 10% per day applies (these are working days and apply from after the assignment is collected. The assignment will be collected some time after noon on the Friday. Also note this penalty is caped so it does not take your final mark below 40 out of 100. Assignment work will not be accepted after the last day of semester. Extensions are possible in documented exceptional cases (e.g. medical certificate).

Architecture

You can choose any architecture that has a c compiler and you can execute code on (avoid common loggin servers like partch). The idea is not to find an architecture that gives the highest performance, but to understand the characteristics of the one that you have chosen. By default just use a computer in the CSIT labs. Provide a short summary of the system you are using. This summary should include at least: cpu, clock rate, memory size, memory type, L1/L2 cache size and type, and floating point ability of the cpu.

Discrete Fourier Transform

The discrete Fourier transform can be simply implemented via direct computation of the defining formula (See http://en.wikipedia.org/wiki/Discrete_Fourier_transform ). A simple C code implementation of this is:
void dft(double complex X[], double complex x[], int N) {
   int k,n;
   for (k=0;k<N;k++) {X[k] = 0.0;}
   for (n=0;n<N;n++) {
      for (k=0;k<N;k++) {
          X[k] += x[n] * cexp(-2.0i * M_PI * k * n / N);
      }
   }
}
This simple implementation is O(n^2) where n is the number of complex elements in the array being transformed. There are algorithmically faster approaches for solving this problem, however, in this assignment your task is to understand how the above implementation performs and how its performance can be improved without changing the underling algorithm.

Write a program that will enable you to run and time the execution of the dft function for different sized arrays (ranging from 200 to 20000). Generate a graph that shows the time that different sized arrays take to execute. Answer the following questions: Is this graph perfectly matched to a single quadratic equation? If there are changes in the shape of this graph at what point do these changes occur and why? How do these changes relate to the architecture you are using? How many clock cycles does one time around the main loop take to execute (give the average)? Where are all these clock cycles going?

(optional) Modify the code and report how these modifications effect the performance of the overall program. The types of modification you may wish to attempt include (but not limited to): changing the order in which the main loop is executed in, pre-calculating some of the expression (the pre-calculation is still timed), making use of the fact that the complex exponent is the same when n and k are swapped, changing the compiler optimisation level, using loop unrolling, and modifying the assembled code. You should not make any changes to the algorithm.

Any modification you make should state exactly what the modification was, the change in performance, and the reason behind this change in performance.

Report

The report must be at most 2 pages and contain the following:
  • A title.
  • Your name and student id.
  • Abstract - Overview of what you did and your main findings.
  • Experimental Set up - explain and justify the approach used for timing the DFT. Also provide an overview of the architecture you have used.
  • Results & Discussion - The graph of your findings along with an explanation and discussion of the results.
  • References

In-addition to the main 2 page report you may attach appendices. Please include as one of the appendices the source code of your implementation.

Use a 2-column report format with a 10 point Times-Roman font. The appendices may be single column. The report must be in pdf format that is viewable on the lab machines.

Marking

Marks will be given based on:
  • The approach taken in gathering results along with the justification for the approach used.
  • Any improvements made to the implementation.
  • The reporting of the results along with the discussion.
  • The clarity of the report.
  • The formatting of the report (this includes the formatting of the main graph).

Individual Assignment

This is an individual assignment. You may help each other, however, the final submission must be original and your own work. Take care not to give parts of your solution to other students.

Assignment Submission

The assignment must be submitted via a computer in the CSIT labs (or remotely via ssh to "partch" or one of our servers). To submit the assignment use the following command:
submit comp2300 ass3 report.pdf

A few hits for submission:

  • Don't have spaces in your file names. (the submit program does not like them!)
  • Copy the files to your home directory, rather, than having them mounted on a USB drive.
  • You may submit as many times as you like, I will mark the most recent one that is not late.
  • If you are in comp6300 then use comp6300 rather then comp2300 in the command.