2

Is it possible to schedule a given task to run exactly n machine instructions before control is returned to the user?

The motivation for this question is the debugging of multithreaded programs, where this could be helpful to reliably reproduce certain bugs or undefined behaviour.

I'm particularly interested in the case of x86_64-linux running on an Intel CPU, but solutions for other architectures or operating systems would also be interesting.

The documentation for the kernel perf suite says

Performance counters are special hardware registers available on most modern CPUs. These registers count the number of certain types of hw events: such as instructions executed, cachemisses suffered, or branches mis-predicted - without slowing down the kernel or applications. These registers can also trigger interrupts when a threshold number of events have passed.

so it seems like the hardware could support this in principle, but I'm not sure if this is exposed in any way to the user.

Of course it's possible to just use ptrace to single-step the program n times, but that would make all but the most simple programs impossibly slow.

Benno
  • 5,288
  • 5
  • 42
  • 60
  • 2
    I question the relevance of a simple instruction-count on a pipelined superscalar out-of-order cpu with multi-level caches, non-sequential-conistency memory model and variable cpu frequency. – EOF Sep 17 '15 at 14:44
  • The thorn or difficulty is that most processors have facilities for interrupting by time (usually when a timer expires), and not for instruction count. You will have to check your platform to see what facilities exist for instruction counting. Next, check if your OS provides an API or interrupt handler for this (the ISR will get called when the instruction counter transitions to zero). – Thomas Matthews Sep 17 '15 at 16:46

1 Answers1

1

One simple option to ensure an exact count of the instructions executed is to instrument the assembly code and maintain an execution counter. I believe the easiest way to do instrumentation is Pin ( https://software.intel.com/en-us/articles/pintool ).

High level idea: - interpret machine code and maintain a counter of the number of instructions executed,

  • after each instruction you increment the counter and check if it is time for a breakpoint,

  • reset counter after each breakpoint.

The interpretation idea would introduce quite a bit of overhead. I see a number of straightforward optimizations:

  • Instrument binary statically (create a new binary where all these increments/checks are hard coded). Such an approach would eliminate instrumentation/interpretation overheads. You can consider the instructions related to monitoring/breakpoints as extra instructions executed or chose to ignore them from the counting.

  • The increments/checks can be more smartly implemented. Imagine we have a set of instructions with no jumps/branches you can do one increment and one check. This idea is simple but might prove pretty tricky in practice, especially if you need an absolutely accurate breakpoint..

Radu
  • 1,098
  • 1
  • 11
  • 22