Description
Modern software optimization, adaptive runtime management, and security analytics all depend on hardware performance counters as a low-overhead lens into microarchitectural behavior. Yet vendors provide no accuracy guarantees. Prior studies report over- and undercounting as well as non-determinism even for allegedly exact events, and methodological fragmentation hinders a consolidated picture of their reliability. This thesis investigates the residual run-to-run variability of selected architectural and model-specific events in a near contamination-free environment. We design an extensible framework that combines stringent isolation with a benchmark suite, yielding a tiered classification of event trustworthiness. During development, we found that many hardware events can be triggered with a few simple test instructions. We show that even in an isolated test environment, a handful of tests built from commonly used instructions cause many hardware events to return inconsistent results.
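As a purely illustrative sketch, not the thesis framework itself, the following Linux C program uses the standard perf_event_open(2) interface to count one architectural event (retired instructions) around a fixed instruction sequence and repeats the measurement several times, so any run-to-run spread becomes directly visible. The event choice, workload, and iteration counts are assumptions for illustration; a real measurement would additionally require the stringent isolation described above.

    /* Minimal sketch: observe run-to-run variation of a single hardware
     * event (retired instructions) on Linux via perf_event_open(2). */
    #define _GNU_SOURCE
    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdint.h>
    #include <stdio.h>

    static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags) {
        return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
    }

    int main(void) {
        struct perf_event_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HARDWARE;
        attr.size = sizeof(attr);
        attr.config = PERF_COUNT_HW_INSTRUCTIONS;  /* retired instructions */
        attr.disabled = 1;
        attr.exclude_kernel = 1;
        attr.exclude_hv = 1;

        int fd = (int)perf_event_open(&attr, 0, -1, -1, 0);
        if (fd < 0) { perror("perf_event_open"); return 1; }

        for (int run = 0; run < 5; run++) {
            ioctl(fd, PERF_EVENT_IOC_RESET, 0);
            ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

            /* Fixed workload: ideally the count is identical on every run,
             * so any spread across runs is measurement noise. */
            volatile uint64_t x = 0;
            for (uint64_t i = 0; i < 1000000; i++) x += i;

            ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
            uint64_t count = 0;
            if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
                perror("read");
                break;
            }
            printf("run %d: %llu instructions\n", run, (unsigned long long)count);
        }
        close(fd);
        return 0;
    }

Even for such a nominally deterministic workload, successive runs may report slightly different counts; quantifying and classifying this residual variability across many events is what the framework above does systematically.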