COMP8320 Tutorial 04 -- week 5, 2011
Software Engineering for Multicore
Please read the two articles referred to below
before the tutorial.
If you have any unresolved questions or things you would like explained
from earlier in the course, please have these ready and let your tutor
know at the start of the session. This can include anything about
Assignment 1!
Afterwards, discuss the following questions in small groups.
From the first article:
- What software architecture is implied by Figure 1 (for Biological
Data Analysis)? Re-draw the figure for the appropriate
software architecture in the style indicated in Lecture 6.
- What concurrency model (design pattern) for the overall
application is implied by Figure 1? Identify the patterns used in Figure
2. Why did the design need to become more complex?
- Evidently, this application was implemented using (Posix) threads.
Discuss whether this was necessary (for example, could OpenMP have been used instead?).
- Discuss why tuning is needed for such a software architecture,
and discuss why the main speedups came from the pipeline layer.
- Elaborate the software architecture and concurrency models for the
Monte Carlo application using Strategy 2 (draw diagrams).
- Figure 4 indicates performance of the application up to 64 threads.
Critically assess why the authors considered more than 8 threads,
given the machine used for the experiments. Discuss how you
think critical sections and atomic updates are likely to be
implemented in OpenMP. What do the results indicate about the
implementation of locks in the two different run-time systems
evaluated? (Hint: adaptive, scalable.)
- What software architecture would be appropriate for branch-and-bound
algorithms, such as the TSP? (consider the main data structures mentioned).
- The T2 obtained better scalability than the dual-socket quad-core Xeon
for the TSP. The authors mention L2 cache architecture was a factor; what
unusual aspect of the T2 cache do you think is responsible?
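As a warm-up for the pipeline questions above, the following is a minimal
sketch of the generic pipeline pattern: one thread per stage, connected by
bounded queues. It is not the design from the article. The names stage,
run_pipeline and SENTINEL are hypothetical, and Python is used only for
brevity. In standard CPython builds, threads running pure Python code do
not execute in parallel, so this shows the structure of a pipeline rather
than its speedup.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def stage(func, inbox, outbox):
    """Apply func to each item taken from inbox and pass the result on."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def run_pipeline(items, funcs, capacity=4):
    """Push items through one thread per stage; input order is preserved."""
    # Bounded queues give back-pressure: a fast stage blocks on a slow one.
    queues = [queue.Queue(maxsize=capacity) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]

    def feed():
        for item in items:
            queues[0].put(item)
        queues[0].put(SENTINEL)

    # Feed from a separate thread so the caller can drain the last queue
    # at the same time; otherwise the bounded queues could deadlock.
    feeder = threading.Thread(target=feed)
    for t in workers + [feeder]:
        t.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)

    for t in workers + [feeder]:
        t.join()
    return results
```

A point to consider in discussion: with one thread per stage, throughput is
limited by the slowest stage, which is one reason such designs need tuning.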
From the second article:
- In BZip2, why is parallelization over file blocks likely to be
the most cost-effective approach to parallelization? What aspect of the
application brings a complication in this, and how would you deal with
it?
- Why was a significant degree of code-refactoring needed for
the BZip2 codes used?
- Critique the strategies of Teams 2 and 3. What was the purpose
of performing profiling?
- Why is OpenMP not a suitable paradigm for this application?
- From the point of view of state-of-the-art Software Engineering (for
example in process and quality control), what was lacking in all the teams'
approaches?
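To make the idea of parallelizing over file blocks concrete, here is a
minimal sketch using Python's standard bz2 module. It is not the approach
of any team in the article. It compresses each chunk as an independent
bzip2 stream and concatenates the streams, which sidesteps rather than
solves the complication the first question asks about. The names
split_blocks and parallel_compress and the chunk size are assumptions made
for illustration.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 900 * 1024  # hypothetical chunk size in bytes


def split_blocks(data, block_size=BLOCK_SIZE):
    """Cut the input into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def parallel_compress(data, workers=4, block_size=BLOCK_SIZE):
    """Compress each chunk as its own bzip2 stream and concatenate them."""
    blocks = split_blocks(data, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the output stays ordered
        # even if the chunks finish compressing out of order.
        streams = list(pool.map(bz2.compress, blocks))
    return b"".join(streams)


if __name__ == "__main__":
    payload = b"multicore " * 200_000
    packed = parallel_compress(payload, workers=2, block_size=100_000)
    # bz2.decompress accepts several concatenated streams.
    assert bz2.decompress(packed) == payload
```

Because every chunk carries its own stream header, the result is usually
slightly larger than a single-stream compression of the same data.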
The source code for the BZip2 used in this paper may be obtained from
wallaman:/dept/dcs/comp8320/public/tut04/bzip2-1.0.5.tar.gz.
Last modified: 2/09/2011, 10:50