Modern High-Performance Computing (HPC) systems are composed of heterogeneous hardware, including a diverse mix of accelerator devices to which compute kernels can be offloaded. Historically, achieving good performance on these devices has required the use of multiple programming languages, typically with a separate optimized implementation for each new device. Device performance is therefore ultimately tied to the language used and, implicitly, to the underlying framework, runtime and compiler, collectively known as the back-end. Benchmark suites have been developed to compare the performance of different language/device combinations.
In this project, we aim to assess the consistency of standard benchmarks across different implementations: whether they truly perform an apples-to-apples comparison in both code functionality and underlying workload characteristics. In particular, we focus on Rodinia, which was specifically developed to compare languages and their associated back-ends by duplicating applications in multiple languages, namely OpenCL, CUDA and OpenACC. Recently, we have also added OpenMP and SYCL versions of a majority of the Rodinia benchmarks.
This assessment will use the Architecture-Independent Workload Characterization (AIWC) tool to compare benchmark versions, identify and describe inconsistencies, and modify benchmarks as appropriate to support fair comparison. The output of this project will take the form of improvements to existing benchmark suites, performance measurements, and documentation, all made available as free software.
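One way such a comparison could work is sketched below: given per-implementation workload metrics (as AIWC-style feature vectors), flag any metric whose relative difference between two implementations of the same benchmark exceeds a threshold. This is a minimal illustration only; the metric names, values, and threshold are hypothetical and do not reflect real AIWC output or measured results.

```python
def flag_inconsistencies(metrics_a, metrics_b, threshold=0.10):
    """Return metrics shared by both implementations whose relative
    difference exceeds `threshold`, mapped to that difference."""
    flagged = {}
    for name in metrics_a.keys() & metrics_b.keys():
        a, b = metrics_a[name], metrics_b[name]
        # Normalize by the larger magnitude; guard against division by zero.
        denom = max(abs(a), abs(b), 1e-12)
        rel = abs(a - b) / denom
        if rel > threshold:
            flagged[name] = rel
    return flagged

# Illustrative numbers only, not measured AIWC results.
opencl_metrics = {"opcode_count": 1.02e9,
                  "total_memory_footprint": 6.4e7,
                  "branch_entropy": 0.12}
cuda_metrics   = {"opcode_count": 1.91e9,
                  "total_memory_footprint": 6.4e7,
                  "branch_entropy": 0.31}

print(sorted(flag_inconsistencies(opencl_metrics, cuda_metrics)))
```

A large divergence in a supposedly architecture-independent metric (here, opcode count and branch entropy) would suggest the two versions are not performing equivalent work and warrant closer inspection.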