I'm following this whitepaper by Intel to benchmark code execution.
It uses `cpuid` to fence the reads of the timestamp registers, which seems to work alright.
I'm more interested in the calls `preempt_disable()` and `local_irq_save()`, which are used to prevent any interference while measuring.
When I run a benchmark like this with nothing in the measured section, I get an average of 24 cycles. However, around 10 out of 100,000 measurements take a "long" time, i.e. several tens of thousands of cycles. What is the root cause of these spikes, and how can I get rid of them?
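To make the spikes easier to quantify, the raw cycle counts can be split into baseline samples and outliers. The following is only a minimal sketch, assuming the samples have been copied out of the module and are post-processed in user space (it uses floating point, which kernel code should avoid). The names `spike_stats` and `summarize_samples` and the threshold parameter are hypothetical and not part of the module or the whitepaper.

```c
#include <stddef.h>
#include <stdint.h>

/* Summary of one benchmark run (hypothetical helper type). */
struct spike_stats {
    size_t   spikes;        /* samples strictly above the threshold */
    uint64_t max;           /* largest sample seen */
    double   baseline_mean; /* mean of the samples at or below the threshold */
};

/*
 * Split cycle counts into baseline samples and spikes.
 * threshold is a cutoff chosen by the caller, e.g. 1000 cycles
 * when the typical sample is around 24 cycles.
 */
static struct spike_stats summarize_samples(const uint64_t *samples,
                                            size_t n, uint64_t threshold)
{
    struct spike_stats s = {0, 0, 0.0};
    uint64_t sum = 0;
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (samples[i] > s.max)
            s.max = samples[i];
        if (samples[i] > threshold) {
            s.spikes++;          /* outlier: count it, keep it out of the mean */
        } else {
            sum += samples[i];   /* baseline sample */
            kept++;
        }
    }
    if (kept > 0)
        s.baseline_mean = (double)sum / (double)kept;
    return s;
}
```

Calling `summarize_samples(samples, n, 1000)` on a run then reports how many measurements exceeded 1000 cycles, the worst case, and the average of the remaining samples, which shows whether the spikes are rare outliers or whether they shift the whole distribution.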
I wrote a minimal kernel module to do the measurements; the source code can be found on GitHub.
I'm running Ubuntu 18.04.5 LTS with a 4.15.0-118-generic kernel on an Intel(R) Xeon(R) Silver 4215 CPU @ 2.50GHz. I also ran the benchmark on an i9-9900 and got the same spikes there.