
I want to record the synchronization operations (locks, semaphores, barriers) of a multithreaded application so that I can replay the recorded run later for debugging.

One way is to supply my own lock, semaphore, and condition-variable functions that also do logging, but I think that is overkill, because underneath they must all be built on some common synchronization operations.

So my question is: which synchronization operations should I log so that my program needs minimal modification? In other words, on which glibc functions or macros, and on which system calls, are all these synchronization operations built, so that I only need to modify those for logging and replaying?

pythonic
  • How will recording all synchronization operations allow you to "replay the recorded application"? There's a lot more to a program than the order of synchronization operations, especially if there are race-conditions due to missing synchronization (which is also the sort of problem for which I imagine you'd want to have debugging help). – Mark Byers May 01 '12 at 20:46
  • For the moment, forget about race conditions. I know that they are important too. – pythonic May 01 '12 at 20:51
  • [`strace(1)`](http://linux.die.net/man/1/strace) might be helpful, but it won't catch synchronization operations that occur entirely in userspace without transitioning to the kernel, such as uncontested mutex locks. – Adam Rosenfield May 01 '12 at 20:51
  • I don't think that this can be done without considerable OS support which I'm fairly sure will not be forthcoming. Threads become ready on driver interrupts as well as signals from other threads. The behaviour of a preemptive multitasker can be described as pseudo-chaotic, indeterminate on a macro scale and essentially unrepeatable. – Martin James May 01 '12 at 20:51
  • Let's assume we don't have asynchronous signals. I know how to replay them, but for the moment let's just concentrate on synchronization operations. – pythonic May 01 '12 at 20:53

3 Answers


In your case, an effective method of "logging" these calls on Linux may be to use the LD_PRELOAD trick: override the library functions with your own versions, which log the use of the call and then forward to the real function.

A more extensive example is given here in Linux Journal.

As you can see at these links, the basic gist of the "trick" is that you can make the system load your own dynamic library before any other system libraries, such as pthreads, and then mask the calls to those library functions by having your own versions of them take precedence. Inside your overriding function you can then log the use of the original function and pass the arguments on to the actual call you're trying to log.

The nice thing about this method is that it will catch pretty much any call that is resolved through the dynamic linker, whether the function remains entirely in user-land or ends up making a kernel call.
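As a rough illustration of the idea, here is a minimal sketch of such an interposing library for two pthread functions. It assumes Linux with glibc; the helper names (`log_sync`, `real_fn`, `sync_seq`) and the log format are made up for this example, and a real recorder would also need to cover condition variables, semaphores, barriers and so on.

```c
#define _GNU_SOURCE            /* needed for RTLD_NEXT and syscall() */
#include <dlfcn.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

typedef int (*mutex_fn)(pthread_mutex_t *);

/* Hypothetical global counter giving each logged operation a sequence number. */
static atomic_ulong sync_seq;

/* Look up the next definition of `name` after this library (the real one). */
static mutex_fn real_fn(const char *name)
{
    return (mutex_fn)dlsym(RTLD_NEXT, name);
}

/* Write one log line with write(2), avoiding stdio's own internal locking. */
static void log_sync(const char *op, void *obj)
{
    char buf[128];
    unsigned long seq = atomic_fetch_add(&sync_seq, 1);
    int n = snprintf(buf, sizeof buf, "%lu tid=%ld %s %p\n",
                     seq, (long)syscall(SYS_gettid), op, obj);
    if (n > 0 && write(STDERR_FILENO, buf, (size_t)n) < 0) {
        /* logging failure is ignored in this sketch */
    }
}

int pthread_mutex_lock(pthread_mutex_t *m)
{
    static mutex_fn real;      /* lazy lookup; every thread stores the same value */
    if (!real)
        real = real_fn("pthread_mutex_lock");
    int rc = real(m);
    log_sync("lock", m);       /* logged while holding m, so order matches acquisition */
    return rc;
}

int pthread_mutex_unlock(pthread_mutex_t *m)
{
    static mutex_fn real;
    if (!real)
        real = real_fn("pthread_mutex_unlock");
    log_sync("unlock", m);     /* logged before the mutex is released */
    return real(m);
}
```

Assuming the file is saved as `synclog.c` (a hypothetical name), it could be built with `gcc -shared -fPIC -o libsynclog.so synclog.c -ldl` and used as `LD_PRELOAD=./libsynclog.so ./your_app`. Statically linked programs and code that issues system calls or atomic instructions directly would not be intercepted this way.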

Jason

The best I can think of is debugging with gdb in 'record' mode:

According to this page: GDB Process Record threading support is underway, but it might not be complete yet.


Less strictly answering your question, may I suggest

On other platforms, several other threading checkers exist, but I haven't got much experience with them.

sehe

So GDB record mode doesn't support multithreading, but the RR record/replay system absolutely does: https://rr-project.org/ .

For a commercial solution with fewer technical restrictions, there's also UDB: https://undo.io/solutions/ .

I've worked on debuggers for some years now, and from what I've seen, GDB's record and replay support is really not ready for prime time, for this and other reasons (e.g., slowdown and huge memory requirements).

If you can get it to work in your dev environment, record and replay (reversible) debugging can be pretty game-changing for your workflow; I hope you find a way to leverage it.

Lee Marshall