Intro: I have written a Linux kernel module for performance counter monitoring on an ARMv7 platform with Cortex-A15 and Cortex-A7 processors (Odroid XU3). One counter I am trying to use in my research is the cycle count, which according to the ARM technical reference manuals has its own dedicated counter. I have checked my code against other implementations and ARM references found online; here is a snippet of the part that enables the CPU counters:
Resources Used:
- How to measure program execution time in ARM Cortex-A8 processor?
- http://neocontra.blogspot.se/2013/05/user-mode-performance-counters-for.html
- https://pietrotech.wordpress.com/2016/09/28/sample-performance-counters-on-little-and-big-cluster-on-odroid-xu3-processor-exynos-5422/
- ARM Reference Manual (Ch. 11, PMU)
Problem: When I print the cycles elapsed over a fixed sampling period (100 ms) at a fixed CPU frequency (1.4 GHz in the case of core 0), I see a huge amount of variance in the values returned by the module. See the chart below for an example. Not only is the variance very high, but the measured cycle counts also do not match what I would expect given the sample time and fixed frequency: 1.4 GHz × 0.1 s = 1.4e8 cycles per sample in this scenario. What could be causing such divergence from the expected number of cycles?
Chart: Variability of measured cycles for the kernel module running across all cores and across just core 0.
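To make the expectation concrete, here is a minimal sketch of the arithmetic I am comparing against. It is not the module's actual code: `expected_cycles` and `elapsed_cycles` are hypothetical helper names used only for illustration. The PMCR bit positions and the CP15 encodings for PMCR and PMCCNTR are taken from the ARMv7-A PMU description; the two read functions only compile on a 32-bit ARM target and must run at a privilege level that is allowed to access the PMU. The sketch assumes the counter wraps at most once between two reads, which holds for a 100 ms window at 1.4 GHz because a 32-bit counter takes about 3 s to wrap.

```c
#include <stdint.h>

/* PMCR bit positions (ARMv7-A PMU). */
#define PMCR_E (1u << 0)  /* enable all counters */
#define PMCR_P (1u << 1)  /* reset event counters */
#define PMCR_C (1u << 2)  /* reset cycle counter */
#define PMCR_D (1u << 3)  /* cycle counter ticks once every 64 cycles */

/* Hypothetical helper: cycles expected in one sampling window. */
static inline uint64_t expected_cycles(uint64_t freq_hz, uint64_t period_ms)
{
    return freq_hz / 1000u * period_ms;
}

/*
 * Hypothetical helper: cycles elapsed between two raw PMCCNTR reads.
 * Unsigned 32-bit subtraction absorbs a single wraparound. If PMCR.D
 * is set, each tick of the counter stands for 64 core cycles.
 */
static inline uint64_t elapsed_cycles(uint32_t start, uint32_t end,
                                      uint32_t pmcr)
{
    uint64_t ticks = (uint32_t)(end - start);
    return (pmcr & PMCR_D) ? ticks * 64u : ticks;
}

#if defined(__arm__)
/* Read PMCR (control register): c9, c12, 0. */
static inline uint32_t read_pmcr(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(v));
    return v;
}

/* Read PMCCNTR (cycle counter): c9, c13, 0. */
static inline uint32_t read_pmccntr(void)
{
    uint32_t v;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(v));
    return v;
}
#endif
```

With these helpers, `expected_cycles(1400000000, 100)` gives the 1.4e8 figure quoted above, and passing the value returned by `read_pmcr()` into `elapsed_cycles` is one way to rule out the divide-by-64 setting as a source of the mismatch.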