ANU Computer Science Technical Reports
TR-CS-98-07
Peter Strazdins.
A comparison of lookahead and algorithmic blocking techniques for
parallel matrix factorization.
July 1998.
[POSTSCRIPT (163621 bytes)] [PDF (320444 bytes)] [EPrints archive]
Abstract: In this paper, we analyse and compare the
techniques of algorithmic blocking and ( storage blocking with)
lookahead for distributed memory LU, LLT and QR factorizations.
Concepts and some useful properties of a simplified model of lookahead are
explored, including the minimal degree of lookahead required for optimal
performance. Issues in the implementation of lookahead are discussed, which
are more involved for the cases of LLT and QR factorizations. It is also
explained how hybrid algorithmic blocking and lookahead techniques can be
implemented. Implications for parallel linear algebra library design are also
discussed. Results are given on the Fujitsu AP1000 and AP+
multicomputers, which have relatively high communication to computation to
speeds. The results indicate that both methods are superior to storage
blocking (without lookahead). They also indicate that for such machines, the
hybrid method is optimal for smaller matrices, due to savings in
communication startups. For larger matrices, algorithmic blocking gave the
best performance, due to its better load balancing properties. An exception
was LLT for the AP+, where lookahead alone gave comparable or better
performance for larger matrix sizes as well. Performance models, predicting
the minimum matrix size where lookahead becomes effective, indicate this
trend can be expected for machines with lower communication to computation
speeds, but that the range for where lookahead is superior is extended.
Technical Reports <Technical-DOT-Reports-AT-cs-DOT-anu.edu.au>
Last modified: Tue May 31 12:56:00 EST 2011