Performance Counting for Cycles Inconsistent and not Reflecting CPU Frequency

Question

Intro: I have written a Linux kernel module for performance counter monitoring on an ARMv7 platform with Cortex-A15 and Cortex-A7 processors (Odroid XU3). One counter I am trying to use in my research is the cycle count, which, according to the ARM technical reference manuals, has its own dedicated counter. I have checked my code against other implementations and ARM references found online; here is a snippet of the part that enables the CPU counters:

Resources Used:

Problem: When I print the cycles elapsed over a fixed sampling period (100 ms) at a fixed CPU frequency (1.4 GHz in the case of core 0), I see a huge amount of variance in the values returned by the module. See the chart below for an example. Not only is the variance very high, but the measured cycle counts also do not match what I would expect given the sample time and fixed frequency (for this scenario I expected 1.4e8 cycles in each sample). What could be causing such divergence from the expected number of cycles?

Variability of measured cycles for kernel module running across all cores and across just core 0.
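For reference, the expected count quoted in the question is simply frequency times sampling period. The sketch below works that arithmetic through and shows how a measured sample could be turned into the share of the period during which the counter was actually running. The sample value is hypothetical and not read from the chart, and the function names are illustrative only.

```python
def expected_cycles(freq_hz: int, period_ms: int) -> int:
    """Cycles a core would count if its clock ran for the whole period."""
    return freq_hz * period_ms // 1000


def clocked_share(measured_cycles: int, freq_hz: int, period_ms: int) -> float:
    """Fraction of the period during which the counter must have been running."""
    return measured_cycles / expected_cycles(freq_hz, period_ms)


FREQ_HZ = 1_400_000_000  # 1.4 GHz, the fixed frequency of core 0 in the question
PERIOD_MS = 100          # sampling period from the question

expected = expected_cycles(FREQ_HZ, PERIOD_MS)  # 140_000_000, i.e. 1.4e8

# Hypothetical sample value for illustration, not taken from the chart.
hypothetical_sample = 35_000_000
share = clocked_share(hypothetical_sample, FREQ_HZ, PERIOD_MS)  # 0.25
```

A share well below 1.0 would mean the cycle counter was stopped for part of the sampling window, rather than that the core ran at a lower frequency.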

The CPU frequency is only part of the equation. It should be a factor, but there is no reason to expect a linear relationship. The processor spends a fair amount of time waiting on instructions or data. How much of the measured time is tied to CPU frequency and how much to the rest of the system is something you will have to figure out. — old_timer, May 25 '18 at 20:05
You are also running an operating system, so consistent and repeatable times are not expected. — old_timer, May 25 '18 at 20:18
@old_timer Even if the cpu is waiting on data, shouldn't cycles still be counted as long as the clock is active (so the CPU is not in a low-power sleep state)? — blancm, May 31 '18 at 15:05
Yes, wall-clock time vs. wall-clock time should be accurate to one tick per measurement, but instruction cycles vs. wall-clock time should vary on OS-capable processors running from DRAM, etc. — old_timer, May 31 '18 at 17:22

score 1 · Answer 1 · answered May 25 '18 at 18:08

After further thought and discussions with colleagues, I believe the discrepancy between measured and expected cycles is caused by cpuidle, a subsystem in the Linux kernel that places a CPU core into a lower-power state when the core is not doing anything. Some of the deepest states shut down the clock, which likely causes the cycle counter to stop incrementing. This article gives a nice description of cpuidle and how it works: https://lwn.net/Articles/384146/
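One way to test this hypothesis from user space is to compare the cpuidle residency counters before and after a sampling window. The sketch below reads the per-state `name` and `time` files (cumulative residency in microseconds) that the kernel exposes under `/sys/devices/system/cpu/cpuN/cpuidle/`. Which states actually stop the cycle counter is platform-specific, so the `gated_states` argument is an assumption the caller has to supply, and the function names are illustrative.

```python
import time
from pathlib import Path

SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def read_idle_time_us(cpu: int, sysfs_root: str = SYSFS_CPU_ROOT) -> dict[str, int]:
    """Return {state name: cumulative residency in microseconds} for one CPU.

    Returns an empty dict if the CPU exposes no cpuidle states.
    """
    residency = {}
    cpuidle_dir = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
    for state_dir in sorted(cpuidle_dir.glob("state*")):
        name = (state_dir / "name").read_text().strip()
        residency[name] = int((state_dir / "time").read_text())
    return residency


def clocked_fraction(before: dict[str, int], after: dict[str, int],
                     window_us: int, gated_states: list[str]) -> float:
    """Estimate the share of the window spent outside clock-gated idle states.

    gated_states is an assumption: the caller must know which idle states
    stop the core clock on the platform at hand.
    """
    idle_us = sum(after[s] - before[s]
                  for s in gated_states if s in before and s in after)
    return max(0.0, 1.0 - idle_us / window_us)


def sample_clocked_fraction(cpu: int, window_s: float,
                            gated_states: list[str]) -> float:
    """Sleep for window_s and report the estimated clocked share for cpu."""
    before = read_idle_time_us(cpu)
    time.sleep(window_s)
    after = read_idle_time_us(cpu)
    return clocked_fraction(before, after, int(window_s * 1_000_000), gated_states)
```

Calling `sample_clocked_fraction(0, 0.1, [...])` with the state names reported by `read_idle_time_us(0)` gives a rough estimate for core 0 over a 100 ms window. If that estimate tracks the ratio of measured to expected cycles, it supports the cpuidle explanation.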
