I have an application that creates 2 threads in addition to the main thread.

Those 2 threads are pinned to core_id_2 and core_id_3 (via `pthread_setaffinity_np`).
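
A minimal sketch of that kind of pinning, for illustration only. The helper names `pin_thread_to_core` and `pinned_core_of` are hypothetical and are not taken from the actual application; the sketch assumes Linux with glibc, where `pthread_setaffinity_np` and `pthread_getaffinity_np` are available once `_GNU_SOURCE` is defined.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: restrict `thread` to the single core `core_id`.
 * Returns 0 on success, or an errno value (e.g. EINVAL if the core is
 * not available to this process). */
static int pin_thread_to_core(pthread_t thread, int core_id)
{
    assert(core_id >= 0 && core_id < CPU_SETSIZE);

    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}

/* Hypothetical helper: return the core `thread` is pinned to, or -1 if
 * the query fails or the thread is allowed to run on more than one core. */
static int pinned_core_of(pthread_t thread)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(thread, sizeof(set), &set) != 0)
        return -1;
    if (CPU_COUNT(&set) != 1)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}
```

Calling `pin_thread_to_core(worker, 2)` right after `pthread_create` and then checking `pinned_core_of(worker) == 2` is one way to confirm from inside the program, rather than from htop, that the thread really sits on the core being profiled.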

I ran `taskset -c [core_id_1] perf mem record -c [core_id_2] [executable]` and then `perf mem report`.

The report shows that the application had zero memory load operations but 14M store operations.

Did I run the command incorrectly? The application was reading a file of a few GB in a tight loop, so I'm not sure why it recorded zero loads.

I ran htop in parallel to confirm that core_id_2 was indeed running the application (taking up 100% CPU).

Any suggestions?

This was on a machine running kernel 3.10.0-1062.

HCSF
  • You weren't perhaps using a Broadwell CPU, with kernel permissions implying `--all-user`? [Why does perf stat not count cycles:u on Broadwell CPU with hyperthreading disabled in BIOS?](https://stackoverflow.com/q/75601428) - if so, probably erratum BDE104 – Peter Cordes Mar 30 '23 at 18:34
  • @PeterCordes the machine has Intel Xeon Gold 6524 (Cascade Lake). I don't have the permission to run with `--all-user`. – HCSF Mar 31 '23 at 00:07
  • Ok, then not that erratum. But IDK what you mean about permissions. `perf stat --all-user` implies `-e cycles,instructions` is actually `cycles:u,instructions:u`, user-space only. It takes strictly less permission (via perf_event_paranoid) than also counting events in kernel mode. Maybe your kernel and `perf` version are so ancient that they don't know about that option anyway, but still, if `paranoid` setting is high enough, you're still restricted to only counting user-space events. Not that this would explain not seeing any loads on a non-Broadwell CPU. – Peter Cordes Mar 31 '23 at 01:43
  • Linux 3.10 was originally released in 2013, over 3 years before Skylake / Cascade Lake Xeons. Maybe microarch-specific event numbers have been backported, but if not, maybe it's just too old to know how to program the PMU counters for the appropriate events on hardware way newer than the software. – Peter Cordes Mar 31 '23 at 01:46
  • Sorry, my bad. I thought `-a` and `--all-user` were the same. Running with the former complained about some permission issue. Just re-ran with `perf mem record --all-user`, and `perf mem report` still shows zero loads – HCSF Mar 31 '23 at 03:50
  • `perf_event_paranoid` is currently 2, in case it matters. – HCSF Mar 31 '23 at 03:53
  • The fact that you're able to get non-zero counts for any HW event is good evidence that it's not a permission problem. The reason I mentioned `--all-user` in the first place was that if permissions were limiting all events to counting in user-space, that would be part of the condition for the Broadwell erratum to matter, if you'd been on a Broadwell CPU. Since you're not, it's irrelevant. – Peter Cordes Mar 31 '23 at 04:36
