1

I'm playing around with some Boost threads, and I've written the following code:

#include <boost/thread.hpp>
#include <iostream>

volatile int threads = 0;

const int iterations = 200;

volatile char output[iterations*4];
volatile int k = 0;

void w() {
    threads++;
    for (int i = 0; i < iterations; i++) {
        k++;
        output[k] = '0';
    }
}
void y() {
    threads++;
    for (int i = 0; i < iterations; i++) {
        k++;
        output[k] = '1';
    }
}

int main(int argc, char* argv[]) {
    boost::thread a(&w);
    boost::thread b(&w);
    boost::thread c(&y);
    boost::thread d(&y);

    while (k <= iterations * 4) {}

    for (int i=0; i < k; i++) {
        std::cout << output[i];
    }
    return 0;
}

What I don't understand is why the completion of these threads appeared to wait on each other, this is shown in the output:

000000000000000000000000000000000000000011111111

I was under the impression the output would be in random order, with statements output similar to the following (expected output):

00100000111001000000010001111100000000011111111

Instead they follow, all 0, then all 1 (or vice versa).

Why is this? At this point it seems more like the threads are stacked in a random order, but still joined; ie waiting on each other before executing. I actually expected the threads++ code to possibly run at the same time in some threads, leaving them with the some thisthread value.

Disclaimer: I'm very new to Boost:threads, simply playing around, I understand race conditions are very dangerous (may lead to undefined behavior), this is just putting my hand in the fire so I can understand it.

hiddensunset4
  • 5,825
  • 3
  • 39
  • 61
  • If you're multitasking, you might want to try out OpenMP or Intel Threading Building Blocks. The 'in-thing' is to use task based programming instead of threads. – Nav Jan 31 '11 at 06:29
  • Despite the title of this post, do you intend for there to be a data race on `k`? This one could cause your program not to terminate. – Karmastan Jan 31 '11 at 07:33
  • Yes I do, but I'm all ears to hear how it could have that effect. Seeing as at this point its performing all operations as if the threads are joined (as explained). – hiddensunset4 Jan 31 '11 at 07:57
  • 1
    `k++` is not atomic for volatiles. I.e. it's implemented like `tmp = k; k = tmp + 1;`. If two threads increment at the same time, one update could be lost. When that happens, your main thread will spin forever. – Karmastan Jan 31 '11 at 17:38

2 Answers2

4

There are several things going on here that are preventing you from seeing the context switching behavior you are expecting:

  1. The threads are basically sitting in tight loops. It's not very likely that a context switch will happen in the middle.
  2. You did have an odd context switch in your example when w(2) prints 6, and then 47. It switches from 3 to 2 and back again.
  3. The access to the global variable is probably being optimized away by the C++ compiler. Making the variable volatile could help. Note that volatile is not really a solution. For example, increments may not be an atomic operation (aka fetch, increment, store), and also CPU caches can sometimes cause weird artifacts because stores to memory are not seen by all CPUs at the time they're executed.
  4. You are not using endl. Normally, I think endl is an absolutely horrible idea. But in this case it means that all the writes are being buffered up, which may cause some interesting artifacts.
  5. Additionally, cout will be thread-safe and the locks required to make this happen may also affect the way the threads context switch with each other.

A number of these problems can be solved by calling the write system call directly. This will force a system call which may give another thread the opportunity to run. It will also force the output to happen exactly when the code for it is executed.

Here is a program that will exhibit the behavior you were looking for on a multi-core Unix box. It will also show you the final value of k to demonstrate why volatile isn't really much of a solution, and how it created a race on the value k. I originally just suggested it as a hack to make your program work slightly better for producing the behavior you were looking for because in your original program it didn't really matter:

#include <boost/thread.hpp>
#include <iostream>
#include <unistd.h>

volatile int threads = 0;

const int iterations = 200;

volatile int k = 0;

void w(char thid) {
    threads++;
    for (int i = 0; i < iterations; i++) {
        k++;
        ::write(1, &thid, 1);
    }
}

int main(int argc, char* argv[]) {
    boost::thread a(&w, '0');
    boost::thread b(&w, '1');
    boost::thread c(&w, '2');
    boost::thread d(&w, '3');

    a.join();
    b.join();
    c.join();
    d.join();
    ::write(1, "\n", 1);
    // In general, do not mix write to stdout and cout. In this case
    // it will work, but this is a very limited and specific case.
    ::std::cout << "k == " << k << "\n";
    return 0;
}
Omnifarious
  • 54,333
  • 19
  • 131
  • 194
  • Ah now I'm more confused, I just ran my program through the terminal instead of through my IDE, and the counts are out of order (as they are above, just realised (I used the terminal to copy that out without even looking)). This questions completely moot now... as its obviously just to do with cout... (Which had been my original suspicion, except in my IDE all the counts were perfect order every time.) – hiddensunset4 Jan 31 '11 at 06:23
  • @Daniel - Also, I updated my answer to be a little more detailed and explanatory. – Omnifarious Jan 31 '11 at 06:34
  • Omni, in an attempt to address point 4 and 5, I decided to use a c-string with the thread letters. (http://www.pastie.org/1513855) And the out put was the same: wwwww...yyyyyyyy..wwwwww So it may not have anything to do with 'cout'. – hiddensunset4 Jan 31 '11 at 06:49
  • See updated post for what I currently have, changed to 0 and 1 (w and y was hard on the eyes lol), and just focusing on that, my count output proven they were working synchronously... – hiddensunset4 Jan 31 '11 at 06:58
  • @Daniel - Now you are encountering a different problem. `k` isn't being updated atomically and so not all the bytes in your array are being written to. This also means that your results are skewed. – Omnifarious Jan 31 '11 at 15:32
  • @Daniel - I updated my answer with an example program that I know demonstrates the behavior you're looking for on my machine. I also added a caveat about using `volatile` because that keyword should not really be considered a real solution to thread problems. – Omnifarious Jan 31 '11 at 15:41
2

The main reason for observing the sequential filling of the output buffer is that you threads complete too soon. 200 iterations are not enough for the worker threads to be interrupted. Instead of printing output, I appended this to your main():

int switches = 0;
for (int i=1; i<iterations*4; ++i)
    if (output[i] != output[i-1])
        ++switches;
std::cout << "Switches = " << switches << "\n";

and started increasing iterations. Indeed, at some point (around 10,000,000 on my box) thread switches became apparent. That immediately revealed another problem with your program: increments of k are not atomic. You've declared k as volatile which means the compiler won't cache its value in a register when repeatedly checking it, like you do in the empty loop in main. However, volatile doesn't at all guarantee atomic increments. What does this mean in practice is when actual thread switches occur, k++ doesn't always increment and the while (k<iterations*4) {} loop will never terminate. To increment k atomically you need yet another external library - here's a discussion on the topic on SO. If you happen to be on Windows, it might be easier to use InterlockedIncrement directly. If you do those changes to your program, you'll see the result you are expecting.


There's also a subtle buffer overflow of output. Since you are incrementing k before writing to output, you are writing to indexes from 1 to iterations*4 included. You either need to initialize k to -1 or add one more element to the array

Community
  • 1
  • 1
sbk
  • 9,212
  • 4
  • 32
  • 40
  • +1 The `k++` non-atomic increment issue is the reason why you should never consider `volatile` as a primitive for multithreading. In a few cases (the check in main, assuming the increments were correct) it might be appropriate, but in most cases it will just not be enough. – David Rodríguez - dribeas Jan 31 '11 at 09:07
  • @David Rodriguez: My suggestion of `volatile` in my answer was a 'quick and dirty hack'. I should've been explicit that it isn't really a solution. – Omnifarious Jan 31 '11 at 15:43