
Less of a programming question, and more of an oddity that I'm seeking clarification on. Consider the following C program:

#include <stdlib.h>
#include <pthread.h>
#include <stdio.h>

volatile int counter = 0;

void incrementCounter()
{
    counter += 1;
}

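void* threadfunc(void* arg)
{
    (void)arg; /* unused; pthread_create requires a void* (*)(void*) start routine */
    for (int i = 0; i < 1000; i++)
    {
        incrementCounter();
    }
    return NULL;
}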

int main()
{
    pthread_t tids[100];
    printf("Creating Threads...\n");
    for (int i = 0; i < 100; i++)
    {
        pthread_create(&tids[i], NULL, threadfunc, NULL);
    }

    printf("Joining Threads...\n");

    for (int i = 0; i < 100; i++)
    {
        pthread_join(tids[i], NULL);
    }

    printf("Finished. Counter = %d\n", counter);
}

This is what I wrote for a college assignment. It's supposed to show the dangers of multiple threads writing to a shared variable without locking.

I'm on Windows 10, so I open up my installation of Ubuntu Bash, and run

$ gcc -std=c99 -pthread main.c
$ ./a.out

It prints

Creating Threads...
Joining Threads...
Finished. Counter = 100000

Ok... that's correct. It shouldn't be correct. This is supposed to be broken!

I run it again and again with the same results. Counter = 100000 each and every time! This may be the only time in my life that I'm disappointed my code works correctly.

So I log onto my school's shared Linux system for CS students, pull my code, execute it the same way, and I get:

Creating Threads...
Joining Threads...
Finished. Counter = 99234

Next time, Counter = 99900, then Counter = 100000, then Counter = 99082.

That's what I was expecting!

So my question:

What gives? What is it about the Windows Subsystem for Linux that causes my code not to break?

Adam Schiavone
  • Is your installation of `Ubuntu` running in a virtual machine with a single CPU? – Lou Aug 18 '17 at 20:59
  • @Lou Linux Subsystem For Windows is not a virtual machine. – n. m. could be an AI Aug 18 '17 at 21:06
  • Standard install. Running on an i7 with 4 physical cores. – Adam Schiavone Aug 18 '17 at 21:07
  • @n.m.: just noticed, I had no idea this thing existed until now. Presumably it does *some* thread virtualization under the hood, at least according to OP's observations. Perhaps it would be a good idea to [post a comment here](https://blogs.msdn.microsoft.com/wsl/2016/05/23/pico-process-overview/) to get details about the inner workings. – Lou Aug 18 '17 at 21:07
  • *"I'm disappointed my code works correctly."* - an impossibility, as the code itself is not correct. I suggest you use a smaller pool of threads, and a *much* larger count up target. It is completely feasible that each thread you're launching has already finished its loop and terminated before the next thread from `main` is even spun up. And/or/also latch the beginning of each thread with a global mutex that is initially latched in `main`, and released only when all threads are started, with each thread releasing it as soon as they acquire it prior to starting their loop. – WhozCraig Aug 18 '17 at 21:12
  • Undefined behaviour is *undefined*. This means no guarantees whatsoever. In particular, a result that looks OK is a permissible outcome for a program that exhibits UB. You cannot expect a broken program to demonstrate a spectacular explosion for you each time you run it. – n. m. could be an AI Aug 18 '17 at 21:13
  • https://github.com/Microsoft/BashOnWindows/issues - is a good place to start researching your issue. – vadim_hr Aug 18 '17 at 21:15
  • Possible duplicate of [Undefined, unspecified and implementation-defined behavior](https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior) – n. m. could be an AI Aug 18 '17 at 21:15
  • From what I've seen in the [WSL](https://blogs.msdn.microsoft.com/wsl/) blog, it appears that it's very conservative about creating a large number of threads and spawning them ahead of time, so what @WhozCraig wrote is likely true. – Lou Aug 18 '17 at 21:15
  • Yup - a loop to 1000 is next-to-nothing. – Martin James Aug 18 '17 at 21:59

1 Answer


It is still broken. It just happened to work. You have no guarantees that the code will run correctly or incorrectly. You [still] have a race condition between the threads.

Herein, I'm using two threads: A and B. But, the example works for more threads.

On Linux, you got:

thread A        thread B
-------------   --------------
fetch
                fetch
inc
                inc
store
                store

Here, thread B's store will trash thread A's increment and store. That is, if the original value was 5, you'll get 6 instead of the [desired] 7.

Under Windows, you got:

thread A        thread B
-------------   --------------
fetch
inc
store
                fetch
                inc
                store

But, the difference is just the OS scheduler and its policies. Without locking, there is no guarantee.

Under Windows, it started thread A, which ran to completion, before thread B got started and ran, so you get the "disjoint" behavior.

Try with a larger [or smaller] number of threads and a much larger thread loop value, and you should see some different behavior.

Craig Estey
  • I am pretty sure OP already knows all this, the question was *why* the Ubuntu-on-Windows version doesn't exhibit the behavior. And there are 100 threads involved, not "thread A and B". – Lou Aug 18 '17 at 21:12
  • @Lou Thanks for pointing that out. Yes I understand that the code is horribly broken -- the assignment was to write horribly broken code. – Adam Schiavone Aug 18 '17 at 21:55
  • @Craig Estey your suggestion was spot on. I upped it to 1000 threads counting to 10000 and I started seeing incorrect results. It was very infrequent -- maybe 5% of the time. – Adam Schiavone Aug 18 '17 at 21:57