
I recently set up Intel performance monitoring on processors with the Sandy Bridge microarchitecture to monitor for split locks, which can be highly detrimental to performance in code that runs frequently. Now that I have used this to locate and fix those problems, I am curious about other types of events I could monitor that could negatively affect performance.

Which events take the biggest toll on code speed and efficiency?

The list of events available to me is in Chapter 19 of this manual: https://software.intel.com/sites/default/files/managed/7c/f1/253669-sdm-vol-3b.pdf

Thanks!

user3768274
  • Intel has a section in their optimization manuals about things to check for as potential problems. https://software.intel.com/en-us/articles/intel-sdm#optimization. Obviously cache misses and branch misses are major ones. You can check for long dependency chains or other execution bottlenecks by looking for front-end stalls (`uops_issued.stall_cycles` for example, or other counters to look for cycles when fewer than 4 uops were delivered). – Peter Cordes Sep 27 '17 at 21:33
  • There's no really good way to answer this other than "whichever event your loop bottlenecks on". In some ways, this is the same kind of question as [Deoptimizing a program for the pipeline in Intel Sandybridge-family CPUs](https://stackoverflow.com/questions/37361145/deoptimizing-a-program-for-the-pipeline-in-intel-sandybridge-family-cpus/37362225#37362225), where my answer catalogued a lot of ways you can stall the pipeline. There are perf counters to check for most of them. e.g. FP denormals, partial registers, page-split loads, cache misses. But false deps are harder to find. – Peter Cordes Sep 27 '17 at 21:37
  • You could consider the "topdown" approach promoted by Intel which is intended to find bottlenecks (and ultimately "events that are the most costly") for any piece of code. VTune implements it and Andi Kleen has a free implementation [here](https://github.com/andikleen/pmu-tools/wiki/toplev-manual) based on `perf`. – BeeOnRope Sep 28 '17 at 07:49
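
As a rough sketch of the top-down idea mentioned in the comments: the level-1 breakdown splits all pipeline issue slots (4 per cycle on Sandy Bridge) into four buckets using a handful of counters. The formulas below follow Intel's published top-down method as I understand it; the function name, the parameter names, and the counter values in the usage note are made up for illustration and are not taken from VTune or `toplev`.

```python
def topdown_level1(cycles,
                   idq_uops_not_delivered_core,
                   uops_issued_any,
                   uops_retired_retire_slots,
                   int_misc_recovery_cycles,
                   width=4):
    """Return the four level-1 top-down fractions from raw counter values.

    Hypothetical helper: each argument is the raw count of the
    similarly named event (CPU_CLK_UNHALTED.THREAD for cycles).
    `width` is the number of issue slots per cycle (assumed 4).
    """
    slots = width * cycles

    # Slots where the front end failed to deliver a uop to a ready back end.
    frontend_bound = idq_uops_not_delivered_core / slots

    # Slots wasted on uops that were issued but never retired,
    # plus slots lost while recovering from a misprediction or clear.
    bad_speculation = (uops_issued_any - uops_retired_retire_slots
                       + width * int_misc_recovery_cycles) / slots

    # Slots that did useful work.
    retiring = uops_retired_retire_slots / slots

    # Whatever is left is attributed to back-end stalls
    # (memory or execution-port pressure).
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)

    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }
```

Calling it with invented counts such as `topdown_level1(1000, 800, 2200, 2000, 50)` gives the share of slots in each bucket; the largest bucket suggests which family of events is worth drilling into next.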

0 Answers