ANU Computer Science Technical Reports
TR-CS-02-03
Adam Czezowski and Peter Christen.
How fast is -fast? Performance analysis of KDD applications using
hardware performance counters on UltraSPARC-III.
September 2002.
Abstract: Modern processors and computer systems are
designed to be efficient and achieve high performance with applications that
have regular memory access patterns. For example, dense linear algebra
routines can be implemented to achieve near peak performance. While such
routines have traditionally formed the core of many scientific and
engineering applications, commercial workloads such as database and web servers
and decision support systems (data warehouses and data mining) are among the
fastest growing market segments on high-performance computing platforms. Many
of these commercial applications are characterised by more complex codes and
irregular memory access patterns, which often reduce the performance
actually achieved. Due to their complexity and the lack of source
code, performance analysis of commercial applications is not an easy task.
Hardware performance counters allow detailed analysis of program behaviour,
such as the number of instructions of various types, memory and cache accesses,
hit and miss rates, and branch mispredictions. In this paper we describe
experiments conducted with various KDD applications on an UltraSPARC-III
platform, present the results, and compare these applications with an optimised
dense matrix-matrix multiplication. We focus on compiler optimisations using
the -fast flag and discuss differences between unoptimised and optimised codes.