I would like to prove that although a given core shows 100% utilization in `top`, the thread running on it is actually spinning on a lock. What would be the best metric to use in `perf` to clearly demonstrate this? Presumably some high ratio of stalled front-end to back-end cycles?

kvantour
filimon
  • Attach to it with GDB and single-step it. If it's reading a memory location and branching based on that, it's stuck in a spin loop, waiting for some other thread to modify that memory. Are you on x86? I wouldn't want to try to guess from `perf`, unless you can detect the x86 `pause` instruction from the extremely low IPC (on Skylake and later). On Intel, `mem_inst_retired.lock_loads` should help detect spinning on a `lock cmpxchg` or `xchg` directly, but most good spin loops should spin read-only until it appears that an attempt could succeed. – Peter Cordes Sep 14 '18 at 17:27
  • Hi, thanks, GDB is a good suggestion. I build and deploy the code; we don't have `pause` but rather `nop`, as the codebase is old and uses explicit inline assembly. Can you please elaborate on the second part of your comment: "On Intel, `mem_inst_retired.lock_loads` should help detect spinning on a `lock cmpxchg` or `xchg` directly, but most good spin loops should spin read-only until it appears that an attempt could succeed." In particular, do you mean that I should use these lock loads as a metric in `perf`, and can you point me to how to spin read-only? – filimon Sep 16 '18 at 12:41
  • See [Locks around memory manipulation via inline assembly](https://stackoverflow.com/a/37246263) for a minimal spinlock that spins read-only on a load + `pause` until it looks like `xchg` could succeed. It doesn't include exponential backoff or a fallback to `futex` or other OS-assisted sleep/wake, though. – Peter Cordes Sep 16 '18 at 19:41
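
To illustrate what "spinning read-only" means in the comments above, here is a minimal sketch in C11 rather than inline assembly. It is not the code from the linked answer: the names `demo_spinlock`, `demo_lock`, `demo_unlock`, `cpu_relax` and `run_demo` are made up for this example, and it assumes a compiler with `<stdatomic.h>` and POSIX threads. On x86 the atomic exchange normally compiles to `xchg`, and `_mm_pause()` emits the `pause` instruction; like the linked answer, the sketch has no exponential backoff and no `futex` fallback.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define cpu_relax() _mm_pause()   /* x86 `pause` hint inside the spin loop */
#else
#define cpu_relax() ((void)0)     /* assumption: no spin hint on other targets */
#endif

/* Hypothetical lock type for this sketch: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } demo_spinlock;

static void demo_lock(demo_spinlock *l) {
    for (;;) {
        /* One atomic read-modify-write attempt (typically `xchg` on x86). */
        if (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0)
            return;
        /* Lock is held: spin with plain loads only, so the waiting core
           does not keep issuing locked operations on the cache line. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed) != 0)
            cpu_relax();
    }
}

static void demo_unlock(demo_spinlock *l) {
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

struct worker_arg { demo_spinlock *lock; long *counter; long iters; };

static void *worker(void *p) {
    struct worker_arg *a = p;
    for (long i = 0; i < a->iters; i++) {
        demo_lock(a->lock);
        (*a->counter)++;          /* plain increment, protected by the lock */
        demo_unlock(a->lock);
    }
    return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter value. */
long run_demo(int nthreads, long iters) {
    demo_spinlock lock;
    long counter = 0;
    pthread_t tids[16];
    struct worker_arg arg = { &lock, &counter, iters };

    assert(nthreads > 0 && nthreads <= 16);
    atomic_init(&lock.locked, 0);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

Built with something like `cc -O2 -pthread`, a call such as `run_demo(4, 100000)` keeps several cores contending on the lock. A waiter in this design spends almost all of its time in the load-only inner loop, so the locked-load count it generates should stay small compared with a loop that retries `xchg` on every iteration.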

0 Answers