
I would like to use Intel TSX to write lock-free code.

xbegin
my_inst1
my_inst2
xend

However, for some reason, one of the instructions inside the TSX region causes the transaction to abort.

I would like to know which instruction generates the fault and makes the transaction abort.

Is there any way to find out which instruction generated the fault?

My first attempt was to increment a global counter after each instruction in the TSX region. However, when the fault happens, the updates to the counter are rolled back too, because an abort rolls back every write made inside the TSX region.

Is there any trick to debug TSX execution?

Alex Guteniev
ruach

1 Answer


Use perf record (or another way of accessing the hardware performance counters) to sample an event such as rtm_retired.aborted, which fires on any abort, and/or tx_mem.abort_conflict or tx_mem.abort_capacity to see whether either of those is the cause of the aborts. (You can record multiple events in one run, then see which ones fired in perf report.)

tx_exec.misc1..3 might also be relevant. From perf list on my Skylake desktop:
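As a rough sketch of what recording several events in one run could look like: the program name ./my_tsx_program is a placeholder (an assumption, not something from the question), and the event names are the ones mentioned above, which may not exist on every CPU. The guard only prints the command when perf or the binary is missing.

```shell
#!/bin/sh
# Hypothetical program name (an assumption); replace with your own binary.
PROG=${PROG:-./my_tsx_program}

# Event names as mentioned in the answer; confirm them with `perf list` on your CPU.
EVENTS=rtm_retired.aborted,tx_mem.abort_conflict,tx_mem.abort_capacity

if command -v perf >/dev/null 2>&1 && [ -x "$PROG" ]; then
    # Record all three events in a single run (writes ./perf.data).
    perf record -e "$EVENTS" -- "$PROG"
    # Text-mode report showing which events fired and in which symbols.
    perf report --stdio
else
    echo "would run: perf record -e $EVENTS -- $PROG"
fi
```

If tx_mem.abort_conflict or tx_mem.abort_capacity accounts for most samples, the abort is probably not caused by one specific faulting instruction.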

tx_exec.misc1
[Counts the number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort]

tx_exec.misc2
[Counts the number of times a class of instructions (e.g., vzeroupper) that may cause a transactional abort was executed inside a transactional region]

tx_exec.misc3
[Counts the number of times an instruction execution caused the transactional nest count supported to be exceeded]

See also https://oprofile.sourceforge.io/docs/intel-skylake-events.php

You might need to tweak things to get a reasonable number of samples for an event that doesn't fire very often. I haven't tried this, but hopefully the counts should show up on the guilty instruction itself. rtm_retired.aborted is a precise event; the others don't say so in perf list output.
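One way to do that tweaking, untested like the suggestion above, might be to lower the sample period with -c and request precise sampling with the :pp modifier, then look at per-instruction sample counts with perf annotate. The period of 100 and the program name are assumptions chosen for illustration, and whether :pp is accepted for this event depends on the CPU and kernel.

```shell
#!/bin/sh
# Hypothetical program name (an assumption); replace with your own binary.
PROG=${PROG:-./my_tsx_program}

# Assumed sample period: take one sample every 100 events. Lower it if aborts are rare.
PERIOD=${PERIOD:-100}

# :pp asks perf for precise sampling, so samples should land on the aborting instruction.
EVENT=rtm_retired.aborted:pp

if command -v perf >/dev/null 2>&1 && [ -x "$PROG" ]; then
    perf record -e "$EVENT" -c "$PERIOD" -- "$PROG"
    # Per-instruction view of where the samples landed.
    perf annotate --stdio
else
    echo "would run: perf record -e $EVENT -c $PERIOD -- $PROG"
fi
```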


Some of the RTM/TSX events are only for HLE (Hardware Lock Elision, where you put an extra prefix on a locked instruction).

Use perf list and search for "abort" in the output to find relevant events.
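That search can be sketched as a simple pipeline. The helper name find_abort_events is made up for this example, and the output depends entirely on which events your CPU and perf version expose.

```shell
#!/bin/sh
# Hypothetical helper: keep only lines mentioning "abort" (case-insensitive).
find_abort_events() {
    grep -i abort
}

if command -v perf >/dev/null 2>&1; then
    # perf list prints the known events; filter for abort-related ones.
    perf list 2>/dev/null | find_abort_events || echo "no abort-related events listed"
else
    echo "perf not found; would run: perf list | grep -i abort"
fi
```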

Peter Cordes