HOW FAST IS -FAST? PERFORMANCE ANALYSIS OF KDD APPLICATIONS
USING HARDWARE PERFORMANCE COUNTERS ON ULTRASPARC-III

Adam Czezowski and Peter Christen
Australian National University

Abstract

Modern processors and computer systems are designed to be efficient and achieve high performance with applications that have regular memory access patterns. For example, dense linear algebra routines can be implemented to achieve near peak performance. While such routines have traditionally formed the core of many scienti c and engineering applications, commercial workloads like database and web servers, or decision support systems (data warehouses and data mining) are one of the fastest growing market segments on high-performance computing platforms. Many of these commercial applications are characterised by more complex codes and irregular memory access patterns, which often result in a decrease of performance that is achieved. Due to their complexity and the lack of source code, performance analysis of commercial applications is not an easy task. Hardware performance counters allow detailed analysis of program behaviour, like number of instructions of various types, memory and cache access, hit and miss rates, or branch mispredictions. In this paper we describe experiments and present results conducted with various KDD applications on an UltraSPARC-III platform, and we compare these applications with an optimised dense matrix-matrix multiplication. We focus on compiler optimisations using the -fast and discuss di erences in unoptimised and optimised codes.