3

I know the C++ standard doesn't guarantee anything in the presence of a data race (I believe a data race has undefined behavior, meaning anything goes, including program termination, modifying random memory, etc.).

Is there any architecture where a data race that consists of one thread writing to a memory location and one thread reading from the same location (without synchronization) doesn't result in the read operation reading an undefined value, and where the memory location is "ultimately" (after a memory barrier) updated to the value that was written by the write operation?

[edited to replace "race condition" with "data race"]

anonymous
  • If you are using the file system or a memory mapped file then reading and writing in the same process using two file handles instead of one seems to reduce race conditions. – Ross Bush Mar 30 '14 at 16:55
  • Why do you even care? Such a program would be terrible, because you can never safely port it to another architecture. Just don't. Use `std::mutex` or atomics. It's really easy _and_ safe. – stefan Mar 30 '14 at 17:43
  • The reason I care is that I'm planning to memory map a region of a file. There is a data structure that indicates which portions of the file are referenced and which are not. I have readers on one thread reading referenced regions of the file and writers on another writing to unreferenced regions of the file. I want to understand what would happen if the file is corrupt and a writer writes to a region being read by a reader (I want a minimum of guarantees as to what might happen, i.e. no crash, no security issues, ...) – anonymous Mar 30 '14 at 18:30

3 Answers

6

The problem with data races is not that you can read a wrong value at the machine level. The problem is that both the compiler and the processor perform many optimizations on the code. To make sure these optimizations remain correct in the presence of multiple threads, they need additional information about which variables can be shared between threads. Such optimizations can, for example:

  • reorder operations
  • add additional load and store operations
  • remove load and store operations
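As a minimal sketch of the "remove load operations" case: if `ready` below were a plain `bool`, a compiler could legally hoist its load out of the loop and spin forever. Declaring it `std::atomic<bool>` tells the compiler and the processor that the variable is shared. The names `ready`, `payload`, and `wait_for_payload` are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> ready{false};  // shared flag (hypothetical name)
int payload = 0;                 // plain data, published through `ready`

int wait_for_payload() {
    std::thread producer([] {
        payload = 7;
        // Release store: the write to payload is visible to any thread
        // that later observes ready == true with an acquire load.
        ready.store(true, std::memory_order_release);
    });

    // Every iteration performs a real load; it cannot be hoisted out
    // of the loop the way a load of a plain bool could be.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }

    producer.join();
    return payload;
}
```

Calling `wait_for_payload()` should return 7 once the producer thread has published the value.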

There is a good paper on benign data races by Hans Boehm called *How to miscompile programs with "benign" data races*. The following excerpt is taken from this paper:

Double checks for lazy initialization

This is well-known to be incorrect at the source-code level. A typical use case looks something like

if (!init_flag) {
    lock();
    if (!init_flag) {
        my_data = ...;
        init_flag = true;
    }
    unlock();
}
tmp = my_data;

Nothing prevents an optimizing compiler from either reordering the setting of my_data with that of init_flag, or even from advancing the load of my_data to before the first test of init_flag, reloading it in the conditional if init_flag was not set. Some non-x86 hardware can perform similar reorderings even if the compiler performs no transformation. Either of these can result in the final read of my_data seeing an uninitialized value and producing incorrect results.
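For comparison, here is one possible way to write the double check without a data race, sketched with `std::atomic` and `std::mutex`. This is an illustration and not part of the paper. The names `init_mutex` and `get_data` are hypothetical, and the value 42 is only a placeholder for the initializer that the excerpt elides.

```cpp
#include <atomic>
#include <mutex>

std::atomic<bool> init_flag{false};
std::mutex init_mutex;  // hypothetical name; stands in for lock()/unlock()
int my_data = 0;        // plain data, published through init_flag

int get_data() {
    // Acquire load: if it observes true, the write to my_data is visible too.
    if (!init_flag.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> lock(init_mutex);
        // The mutex orders this check, so a relaxed load is sufficient here.
        if (!init_flag.load(std::memory_order_relaxed)) {
            my_data = 42;  // placeholder for the elided initializer
            // Release store: the write to my_data cannot move after it.
            init_flag.store(true, std::memory_order_release);
        }
    }
    return my_data;
}
```

In many cases `std::call_once` or a function-local `static` variable expresses the same intent with less room for error.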


Here is another example, where int x is a shared variable and int r is a local variable.

int r = x;
if (r == 0)
    printf("foo\n");
if (r != 0)
    printf("bar\n");

If we only said that reading x yields an undefined value, then the program would print either "foo" or "bar". But if the compiler transforms the code as follows, x is read twice and may change between the two reads, so the program might also print both strings or neither.

if (x == 0)
    printf("foo\n");
if (x != 0)
    printf("bar\n");
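A minimal sketch of how the transformation can be ruled out, assuming x may be changed to `std::atomic<int>`: the single atomic load may not be replaced by two separate reads, so exactly one branch is taken. The function `report` is a hypothetical helper that returns the text instead of printing it, so the result is easy to check.

```cpp
#include <atomic>
#include <string>

std::atomic<int> x{0};  // the shared variable, now atomic

// Returns what the original snippet would print.
std::string report() {
    // Exactly one atomic load; both tests below use the same value r.
    int r = x.load(std::memory_order_relaxed);
    std::string out;
    if (r == 0)
        out += "foo\n";
    if (r != 0)
        out += "bar\n";
    return out;
}
```

Whatever another thread stores into x concurrently, `report()` returns either "foo\n" or "bar\n", never both and never an empty string.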
nosid
  • In the double check example, the compiler is free to reorder things. Wouldn't that be equivalent to it assuming that init_flag always returns true after the data race? Wouldn't saying that the program behaves as if the read operations read an undefined value be enough? – anonymous Mar 30 '14 at 17:35
0

You can use Linux, where you can fork two or more child processes from a parent process in C++. You can have them all access one memory location and, by using synchronization, achieve what you want to do. See: How to share memory between process fork()?, http://en.wikipedia.org/wiki/Dekker's_algorithm, http://en.wikipedia.org/wiki/Readers%E2%80%93writers_problem
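A rough sketch of that idea, assuming a Linux or other POSIX system where `mmap` supports `MAP_SHARED | MAP_ANONYMOUS` and where `std::atomic<int>` is lock-free (so it also works across processes). The function name `share_value_with_child` and the value 42 are hypothetical.

```cpp
#include <atomic>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the value the child stored, or -1 if mmap or fork failed.
int share_value_with_child() {
    const size_t size = sizeof(std::atomic<int>);
    // Anonymous shared mapping: remains shared with the child after fork().
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    // Construct the atomic inside the shared mapping.
    auto* shared = new (mem) std::atomic<int>(0);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, size);
        return -1;
    }
    if (pid == 0) {
        shared->store(42);  // child writes through the shared mapping
        _exit(0);
    }

    int status = 0;
    waitpid(pid, &status, 0);    // wait until the child has finished
    int value = shared->load();  // parent reads the child's write
    munmap(mem, size);
    return value;
}
```

Here `waitpid` orders the parent's read after the child's write; for readers and writers that run at the same time, the atomic (or a process-shared lock) is what prevents the data race.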

Community
Saurabh Saluja
0

One example that will always result in a race condition: ask two threads to write different values to the same variable. Let's assume that

  • thread one sets variable a to 1
  • thread two sets variable a to 2

You will get a race condition, even with a mutex, because

  • if thread one is executed first then you get a=1 then a=2.
  • if thread two is executed first then you get a=2 then a=1.

The order of the threads depends on the OS, and there is no guarantee about which thread will run first. Otherwise the execution would be sequential and there would be no need for separate threads.
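The two-writer case can be sketched as follows, assuming `std::thread` and `std::mutex`; the function name `run_two_writers` is hypothetical. The mutex removes the data race, but the final value still depends on scheduling.

```cpp
#include <mutex>
#include <thread>

// Runs both writers and returns the final value of a.
int run_two_writers() {
    int a = 0;
    std::mutex m;

    // Each write happens under the lock, so there is no data race.
    std::thread t1([&] { std::lock_guard<std::mutex> lock(m); a = 1; });
    std::thread t2([&] { std::lock_guard<std::mutex> lock(m); a = 2; });

    t1.join();
    t2.join();
    return a;  // 1 or 2, depending on which thread wrote last
}
```

The result is one of the two written values and nothing else, which is the difference between a race condition and a data race.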

Assume now that you have no synchronisation at all and you are doing a=a+1 in the first thread and a=a+2 in the second thread. The initial value of a is 0.

In assembly, the generated code copies the value of a into a register, adds 1 to it (2 in the case of the second thread), and writes the register back to a.

If you have no synchronization at all you can have the following order for example

  • Thread1: value of a copied to reg1. reg1 contains 0

  • Thread2: value of a copied to reg2. reg2 contains 0

  • Thread1: 1 added to reg1. reg1 now contains 1

  • Thread2: 2 added to reg2. reg2 now contains 2

  • Thread1: value of reg1 written to a. a now contains 1

  • Thread2: value of reg2 written to a. a now contains 2

If thread 1 ran to completion and thread 2 ran after it, you would have a=3 at the end. With the interleaving above you end up with a=2: thread 1's update is lost.
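The interleaving above can be replayed deterministically in a single thread, and compared with a version that uses an atomic read-modify-write. This is only a sketch; `replay_lost_update` and `atomic_increments` are hypothetical names, and the local variables `reg1` and `reg2` stand in for the registers.

```cpp
#include <atomic>
#include <thread>

// Replays the problematic interleaving step by step in one thread.
int replay_lost_update() {
    int a = 0;
    int reg1 = a;  // thread 1 loads a (0)
    int reg2 = a;  // thread 2 loads a (0)
    reg1 += 1;     // thread 1 computes 1
    reg2 += 2;     // thread 2 computes 2
    a = reg1;      // thread 1 stores 1
    a = reg2;      // thread 2 stores 2, overwriting thread 1's update
    return a;
}

// Same two updates, but each one is a single atomic read-modify-write.
int atomic_increments() {
    std::atomic<int> a{0};
    std::thread t1([&] { a.fetch_add(1); });
    std::thread t2([&] { a.fetch_add(2); });
    t1.join();
    t2.join();
    return a.load();
}
```

`replay_lost_update()` returns 2, while `atomic_increments()` returns 3 regardless of the order in which the threads run.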

Now imagine a is a pointer, i.e. an address. As you know, using a wrong pointer address can cause the program to crash. So incorrect synchronization can cause the program to crash.

Makes sense?

Gabriel
  • Yes, that makes sense. My question is whether the data race can result in something else than "a" getting a weird value (can it result in a program termination, in other random variables being updated, etc... in true undefined behavior) – anonymous Mar 30 '14 at 17:39
  • 1
    *"You will get undefined behaviour, even with a mutex"* no, you won't. UB has a very specific meaning in C++, namely that ANYTHING can happen. If you use mutexes or atomics to guard the access to `a`, then you might have a *race condition* in the sense that the output depends on the execution speed (whichever thread comes last will win), but you won't have a *data race* and hence no *undefined behavior* - you will end up with one of two values. – MikeMB Apr 21 '15 at 17:39