COMP3320/COMP6464 Molecular Dyanmics Project
Due by 18:00 on Thu 5 April 2012
This assignment is worth 20% of your total course mark.
- For COMP3320 students your assignment will be marked out of 40
- For COMP6464 students your assignment will be marked out of 50
Details of how marks are assigned and what you are required to submit are given below.
The aim of the project is to develop an elementary molecular dynamics (MD)
code and associated visualization software. MD is extremely widely
used in computational
chemistry, physics, biology, material science etc. It was also one of
the first scientific uses of computers going back to work by Alder and
Wainwright who, in 1957, simulated the motions of about 150 argon
atoms. (See Alder, B. J. and Wainwright,
T. E. J. Chem. Phys. 27, 1208 (1957) and J. Chem. Phys. 31, 459 (1959))
In MD each atom is represented as a point in space. The atoms interact
with each other via pairwise potentials, in a
similar manner to gravity interacting between the planets, sun, moons
and stars. In practice this is an approximation since for atoms quantum
effects are often important - but we'll ignore this and
confine our discussions to "classical" MD.
Gravitational bodies interact via a potential that goes as
m1m2/r12 where m1 and
m2 are the masses of the bodies involved
and r12 is the distance between their centres of mass. For the
purpose of this work we will use a potential of the form
p12 = 1/(r12)12 - 2/(r12)6
To evaluate the interaction for many bodies we must simply perform a
sum over all unique two body interactions. Thus for 3 particles we
will have interactions 1-2, 1-3 and 2-3.
Consider now a cubic box with a side length of R. Along each
dimension of the box we arrange N atoms as indicated by the following
figure (not all atoms are shown!), such that there are a total of N*N*N atoms.
For this configuration of atoms we could calculate the total interaction energy
as a sum over all atoms of all the pairwise interactions of that atom
with all others - we term this the total "potential" energy (PE) of
the system.
STEP 0: Visual display and potential energy evaluation (preliminary, no mark)
Your initial objective is to produce a basic code that can read as input
N (the number of atoms in each dimension) and R (the size of the box
in each dimension). Your code should also compute and print out the
value for the potential energy.
Develop this code using Visual Python (VPython).
Your code should display a visual window showing all the atoms
neatly arranged in a cube, and you should printout the total
potential energy (in the text window).
Note that your python code, call it md1.py should accept the following arguments in order, i) N - number of atoms in each dimension of the box ii) R - the size of the box in each dimension, iii) tstep - the time step, and iv) iter - the number of time steps in your simulation. It should therefore run as follows:
python md1.py N R tstep iter.
STEP 1: Molecular Dynamics: force, velocity, and movement (10 marks)
Given the pairwise potential there will be a pairwise force. This
corresponds to the derivative of the potential with respect to R12.
Assuming the location of each atom is given in Cartesian coordinates
the Cartesian forces are given by:
Fx = (12/r1214 - 12/r128)(x1-x2)
Fy = (12/r1214 - 12/r128)(y1-y2)
Fz = (12/r1214 - 12/r128)(z1-z2)
The total force on a given atom will be the sum of all the
pairwise forces.
Assuming that the forces on at least some of the atoms are non-zero
then the atoms will want to move. MD is the process of moving the
atoms according to the forces they are subjected to. Moving the atoms
equates to integrating the equations of motion over time or solving a
differential equation. There are a variety of ways in which we can do
this, but a popular method is the so called "velocity verlet"
algorithm (W.C Swope et al J. Chem. Physics 76 637
(1982)). This has the following form
coord(t+tstep) = coord(t) + tstep * (velocity(t) + tstep*accel(t)/2)
velocity(t+tstep) = velocity(t) + tstep * (accel(t) + accel(t+tstep))/2
where t is the current time, and tstep is the timestep (the
integration step). And by assuming that all particles have unit mass we can
replace the acceleration by the force (F = ma).
When the particles begin to move they will gain kinetic energy
(KE = 1/2 mv2) with the total KE just a simple sum of all the
atomic KEs. If there is no source or sink of energy then the total
energy (PE+KE) should remain constant, so the gain in KE is offset by a lose
of PE.
Extend your md1.py to compute the force on each atom and allow the
particles to move. Your code will need to also read in the timestep and the
total number of timesteps and to compute and printout the
potential, kinetic and total energy at each timestep.
HERE are some sample outputs for you to
verify that your results are correct
Step 1 Submission Requirements
You are required to submit your md1.py code.
The marks for this step will be given based on the following:
- compliance with submission process as detailed at end of this assignment;
- the readability and structure of your code;
- whether your code can display structures correctly;
- the correctness in calculating the energy values, forces, and velocities;
- whether your code generates the correct format as the sample output.
STEP 2: Python/C interfacing (10 marks)
In this step you are required to re-write the compute intensive parts
of your Python code in C, and modify your Python code to call the C
module. The Python code should read in the various parameters and
display the positions of the atoms as they move. The Python code
should communicate the starting parameters to the C code. This should
then run the MD simulation and periodically to communicate the
coordinates of the atoms back to the Python program to be
displayed. It is not expected that you
communicate the coordinates at every timestep.
In addition to the visual output, you should provide text output
that details the input parameters, and provides the total energy,
potential energy and kinetic energy periodically during the course
of the simulation.
Within the Python portion of your code use time.clock()
routine to report the total time taken to perform the simulation. The
start time is from the beginning of the first simulation step, while
the end time is after the last iteration. This means you do not have
to include setup time, but once the clock is started it should not
be stopped again until everything is complete.
Here are some useful links on Python/C API and the example codes presented in the course lecture:
Step 2 Submission Requirements
Name your modified Python code md2.py and your C code
md2Module.c. You are required to submit there two files
together with a Makefile that will compile the C code and
create the relevant shared object library.
- Your md2.py code will be tested on PARTCH - so make sure
it works without errors on this machine. The paths to
the Python include files in Partch for your C compiler are
/usr/include/python2.6 and /usr/include/python2.6/Numeric. You may
need to use -fPIC option in your compiler to cope with the bugs in the
build system, and use the -shared option to link the shared library.
(If you develop your code on another machine you will need to work out
these details for that machine)
- Your code will be run from the command line by typing
python md2.py N R tstep iter update
where the new "update" argument is an integer specifying how often the
python code should update the positions of the atoms. A value of 5 would imply that Python only visually moves the atoms every 5th timestep. Also only print out the energy values every 5th timestep.
- Your code should generate the same output as for md1.py
The marks for this step will be given based on the following:
- compliance with submission process as detailed at end of this assignment;
- the readability and structure of your Python and C codes;
- your choice of what is done in Python and what is done in C;
- correct interfacing between Python and C code;
- correctness of your code in calculating energy values, forces, velocities;
STEP 3: Accuracy and performance analysis of Python and C codes
(14 marks)
You have now developed both Python and Python/C versions of the same
application code. In this step you are required to analyze and compare
their accuracy and performance. Your are required to submit a short
report of no more than 4 A4 pages addressing the issues raised
below. You are encouraged to use the tools introduced to you during
the labs and lectures, and present your results using tables, graphs
and figures as you see fit.
Step 3 Submission Requirements
You are required to submit a report titled Step3_Report.pdf
addressing the following issues:
- Compare the numerical results printed by md1.py with those
of md2.py for a range of different input values. Include input
parameters that lead to results where rounding is the dominant source
of numerical error and input parameters that lead to results where
truncation is the dominant source of numerical error. Clearly
identifying in your report which case is which and why. (4 marks)
- What is the per-iteration (i.e. each timestep) execution time of
md1.py and md2.py as a function of different problem
dimensions? In each case detail how you expect your code to scale as a
function of this input parameter, then provide performance data to
validate or invalidate your expectation. If you were asked to say what
the largest possible calculation was that you could perform with each
version of the code - what would your answer be? (4 marks)
- What is the most computationally intensive component of your code
in terms of time taken to execute? Is this the same for both versions
of the code? During each iteration, what proportion of the total
execution time is spent calculating the new i) coordinates, ii)
forces, and iii) velocities? How does this proportion change as
problem dimension increases? Does the most computationally intensive
component correspond to the lines of code that is executed most often?
(4 marks)
- Good data structure and algorithms can improve the efficiency of
your code. Describe in plain language what you have done to improve
the efficiency of your code, for example, in sequence of accessing the
atoms, and so on. (2 marks)
STEP 4: Benchmarking and Performance Tuning (6 marks)
In lectures we discussed the role of benchmarks. In this step you are
required to convert your md code into a benchmark that will run on
both Partch and NCI XE system (IDs for the XE will be
distributed in the labs in week 3). For the benchmark produce
a pure C version of your code, which can perform the
same functions as listed in step 2 and 3, but without displaying the
atoms and only printing energies every 10 timesteps. We call this
version mdc.c
Step 4 Submission Requirements
You are required to submit mdc.c and a report titled
Step4_Report.pdf (max 2 A4 pages).
Your mdc.c code should
- Compile on Partch and the XE using the same Makefile as mentioned
above by typing either
Make mdc_partch or Make mdc_XE.
- In both cases the executable produced should be called mdc.exe
- The executable should be run by issuing the following command
mdc.exe N R tstep iter
where the parameters are as defined above.
- Tune the code to achieve the best possible performance on the two
platforms.
- On the XE system you may want to note that there are a number of
different compilers available that can be activated by loading
different modules.
Your report should consider the performance of the benchmark on both
platforms. Is the performance what you expected? How did you tune the
benchmark to run fast on each platform?
Marks for this step will be given based on:
- compliance with submission process as detailed at end of this assignment;
- readability and quality of your benchmark code;
- correctness of the benchmark test results;
- how fast your code runs;
- quality of interpretation and analysis of the results in terms of correctness, breadth and depth.
STEP 5 (COMP6464 ONLY): Structure Optimisation (10 marks)
Your objective is to determine the value of R that minimizes
the total potential energy for a given value of N. We will do this
numerically.
This process works as follows. Given an initial size R and number of
atoms N, set a range [R1, R2] so that R is in [R1, R2]. Divide
[R1, R2] into evenly distributed test values of R. For each test value
run you md code to get the value of the total potential energy at
this R value. Which test value of R has the lowest potential
energy? Revise your test range around this value and repeat the
process using R values that are closer together. Continue the
process until you have determined the optimal value of R to within
some numerical threshold.
In what follows you can perform the above manually, or you can write a
python script to do this. You can also use more elaborate minimization
techniques. Some useful numerical implementation
of the minimisation methods can be found in the following book:
William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian
R. Flannery. Numerical Recipes in C - The Art of Scientific
Computing . Second Edition, Cambridge University Press, 1992.
Step 5 Submission Requirements
You are required to submit a report titled Step5_Report.pdf
(max 2 A4 pages). The report should address the following
- For N=5 and N=10 what are the values of R that minimizes the total
energies of these two systems. Provide details to explain how you
determined these values. How accurate are each of your reported values
for R?(4 Marks)
- For N=5 and N=10 and with the optimized R values determined above
what is the spacing between adjacent atoms? Is this value larger or
smaller for the N=5 system compared to the N=10 system? Explain this
observation.(2 Marks)
- For N=5 and N=10 and with the optimized R values determined above
is the force on each individual atom zero (to within the uncertainty
associated with R)? Run one of your md codes to verify whether you are
right. Explain what you did in verifying this and why you observe what
you do from your code.(4 Marks)
Submission Details
All your submitted code should include this header at the top:
Name:
Student Number:
Course:
Assignment Number:
Name of this file:
Lab Group:
I declare that the material I am submitting in this file is entirely
my own work. I have not collaborated with anyone to produce it, nor
have I copied it, in part or in full, from work produced by someone else.
I have not given access to this material to any other student in this course.
You are required to submit a tar file using the submit tool on one of
the student computers:
submit comp3320 md_project md.tar
or if you are a COMP6464 student
submit comp6464 md_project md.tar
The tar file should expand to give the following:
- a directory called YOURSTUDENTID_ass1_part1 (such as u1000000_ass1_part1)
that contains md1.py, md2.py, md2Module.c, and Makefile. This
directory should also contains a plain text file called README that
tells the users (me) exactly how to run your code and either contains
or points the user to some sample input/output data so they can verify
that your code works (or gives identical results to what you got).
This need not be as detailed as the report for Step 3.
In the same directory, you should also include a plain text file
called WRITEUP that details how long it took you to complete
this part of the project and details of who you collaborated with (if anyone).
- a directory called YOURSTUDENTID_ass1_part2 which contains
mdc.c in step 4, and the Makefile. A README file shall be provided
to explain how to run the codes. You also should put your Step3_Report.pdf and Step4_Report.pdf here.
- (COMP6464 student only:) a directory called YOURSTUDENTID_ass1_part3 that contains Step5_Report.pdf. If you choose to write a Python code for the optimisation task, please name your code md3.py, and put it in this directory with a README file.