12

Normally it is said that multi threaded programs are non-deterministic, meaning that if it crashes it will be next to impossible to recreate the error that caused the condition. One doesn't ever really know what thread is going to run next, and when it will be preempted again.

Of course this has to do with the OS thread scheduling algorithm and the fact that one doesn't know what thread is going to be run next, and how long it will effectively run. Program execution order also plays a role as well, etc...

But what if you had the algorithm used for thread scheduling and what if you could know when what thread is running, could a multi threaded program then become "deterministic", as in, you'll be able to reproduce a crash?

tshepang
  • 12,111
  • 21
  • 91
  • 136
Tony The Lion
  • 61,704
  • 67
  • 242
  • 415
  • Maybe in the furture, in some machine it's not allowed multithreaded programming, there's some library framework for simulating multithreaded environment, then you can trace back and reproduce the bug. – coolkid Sep 30 '10 at 14:02

8 Answers8

12

Knowing the algorithm will not actually allow you to predict what will happen when. All kinds of delays that happen in the execution of a program or thread are dependent on environmental conditions such as: available memory, swapping, incoming interrupts, other busy tasks, etc.

If you were to map your multi-threaded program to a sequential execution, and your threads in themselves behave deterministically, then your whole program could be deterministic and 'concurrency' issues could be made reproducible. Of course, at that point they would not be concurrency issues any more.

If you would like to learn more, http://en.wikipedia.org/wiki/Process_calculus is very interesting reading.

Habbie
  • 2,150
  • 15
  • 17
  • 1
    I wrote a translator from C to CSP(a kind of Process_calculus) use in PAT tool(www.comp.nus.edu.sg/~pat/). There're some Model checkers outthere for you checking your concurrent program. – coolkid Sep 30 '10 at 14:06
6

My opinion is: technically no (but mathematically yes). You can write deterministic threading algorithm, but it will be extremely hard to predict state of the application after some sensible amount of time that you can treat it is non-deterministic.

Andrey
  • 59,039
  • 12
  • 119
  • 163
3

There are some tools (in development) that will try to create race-conditions in a somewhat predictable manner but this is about forward-looking testing, not about reconstructing a 'bug in the wild'.

CHESS is an example.

H H
  • 263,252
  • 30
  • 330
  • 514
3

It would be possible to run a program on a virtual multi-threaded machine where the allocation of virtual cycles to each thread was done via some entirely deterministic process, possibly using a pseudo-random generator (which could be seeded with a constant before each program run). Another, possibly more interesting, possibility would be to have a virtual machine which would alternate between running threads in 'splatter' mode (where almost any variable they touch would have its value become 'unknown' to other threads) and 'cleanup' mode (where results of operations with known operands would be visible and known to other threads). I would expect the situation would probably be somewhat analogous to hardware simulation: if the output of every gate is regarded as "unknown" between its minimum and maximum propagation times, but the simulation works anyway, that's a good indication the design is robust, but there are many useful designs which could not be constructed to work in such simulations (the states would be essentially guaranteed to evolve into a valid combination, though one could not guarantee which one). Still, it might be an interesting avenue of exploration, since large parts of many programs could be written to work correctly even in a 'splatter mode' VM.

tshepang
  • 12,111
  • 21
  • 91
  • 136
supercat
  • 77,689
  • 9
  • 166
  • 211
0

I don't think it is practicable. To enforce a specific thread interleaving we require to place locks on shared variables, forcing the threads to access them in a specific order. This would cause severe performance degradation.

Replaying concurrency bugs is usually handled by record&replay systems. Since the recording of such large amounts of information also degrades performance, the most recent systems do partial logging and later complete the thread interleavings using SMT solving. I believe that the most recent advance in this type of systems is Symbiosis (published in this year's PLDI conference). Tou can find open source implementations in this URL:

http://www.gsd.inesc-id.pt/~nmachado/software/Symbiosis_Tutorial.html

João Matos
  • 6,102
  • 5
  • 41
  • 76
0

This is actually a valid requirement in many systems today which want to execute tasks parallelly but also want some determinism from time to time.

For example, a mobile company would want to process subscription events of multiple users parallelly but would want to execute events of a single user one at a time.

One solution is to of course write everything to get executed on a single thread. Another solution is deterministic threading. I have written a simple library in Java that can be used to achieve the behavior I have described in the above example. Take a look at this- https://github.com/mukulbansal93/deterministic-threading.

Now, having said that, the actual allocation of CPU to a thread or process is in the hands of the OS. So, it is possible that the threads get the CPU cycles in a different order every time you run the same program. So, you cannot achieve the determinism in the order the threads are allocated CPU cycles. However, by delegating tasks effectively amongst threads such that sequential tasks are assigned to a single thread, you can achieve determinism in overall task execution.

Also, to answer your question about the simulation of a crash. All modern CPU scheduling algorithms are free from starvation. So, each and every thread is bound to get guaranteed CPU cycles. Now, it is possible that your crash was a result of the execution of a certain sequence of threads on a single CPU. There is no way to rerun that same execution order or rather the same CPU cycle allocation order. However, the combination of modern CPU scheduling algorithms being starvation-free and Murphy's law will help you simulate the error if you run your code enough times.

PS, the definition of enough times is quite vague and depends on a lot of factors like execution cycles need by the entire program, number of threads, etc. Mathematically speaking, a crude way to calculate the probability of simulating the same error caused by the same execution sequence is on a single processor is-

1/Number of ways to execute all atomic operations of all defined threads

For instance, a program with 2 threads with 2 atomic instructions each can be allocated CPU cycles in 4 different ways on a single processor. So probability would be 1/4.

Mukul Bansal
  • 878
  • 8
  • 10
-2

Lots of crashes in multithreaded programs have nothing to do with the multithreading itself (or the associated resource contention).

Joe
  • 41,484
  • 20
  • 104
  • 125
-3

Normally it is said that multi threaded programs are non-deterministic, meaning that if it crashes it will be next to impossible to recreate the error that caused the condition.

I disagree with this entirely, sure multi-threaded programs are non-deterministic, but then so are single-threaded ones, considering user input, message pumps, mouse/keyboard handling, and many other factors. A multi-threaded program usually makes it more difficult to reproduce the error, but definitely not impossible. For whatever reasons, program execution is not completely random, there is some sort of repeatability (but not predictability), I can usually reproduce multi-threaded bugs rather quickly in my apps, but then I have lots of verbose logging in my apps, for the end users' actions.

As an aside, if you are getting crashes, can't you also get crash logs, with call stack info? That will greatly aid in the debugging process.

Chris O
  • 5,017
  • 3
  • 35
  • 42
  • 2
    "considering user input, message pumps, mouse/keyboard handling," - these are relatively easy to simulate (playback) in a completely deterministic manner. thread scheduling, not so much. – imre Jun 04 '12 at 13:45