C++ - Threads without coordinating mechanism like mutex_Lock

Question

I attended one interview two days back. The interviewed guy was good in C++, but not in multithreading. When he asked me to write a code for multithreading of two threads, where one thread prints 1,3,5,.. and the other prints 2,4,6,.. . But, the output should be 1,2,3,4,5,.... So, I gave the below code(sudo code)

mutex_Lock LOCK;
int  last=2;
int last_Value = 0;

void function_Thread_1()
{
      while(1)
      {
          mutex_Lock(&LOCK);
          if(last == 2)
          {
              cout << ++last_Value << endl;
              last = 1;
          }
          mutex_Unlock(&LOCK);
      }
}

void function_Thread_2()
{
      while(1)
      {
          mutex_Lock(&LOCK);
          if(last == 1)
          {
              cout << ++last_Value << endl;
              last = 2;
          }
          mutex_Unlock(&LOCK);
      }
}

After this, he said "these threads will work correctly even without those locks. Those locks will reduce the efficiency". My point was without the lock there will be a situation where one thread will check for(last == 1 or 2) at the same time the other thread will try to change the value to 2 or 1. So, My conclusion is that it will work without that lock, but that is not a correct/standard way. Now, I want to know who is correct and in which basis?

Shouldn't the latter thread compare `last` with 1 and set `last` to 2? — Joachim Isaksson, May 05 '13 at 11:07
I'm sure someone will write up a very technical answer, but sharing `last` between threads without a memory barrier (mutex operation) sounds like it may have _worse_ performance than using a mutex, and the other thread is not guaranteed to see the updated `last_Value` even if `last` has changed. — Joachim Isaksson, May 05 '13 at 11:15
I think if you remove the locks and use `std::atomic last = 2;`instead of `int last = 2` everything should work as expcted. — , May 05 '13 at 11:31
@ThePet: Yours is the only correct answer. You should write it up. (The interviewer is incorrect, but would be correct if you changed `last` to `atomic`.) — Wandering Logic, May 05 '13 at 11:38

score 3 · Answer 1 · answered May 05 '13 at 11:13

3

Without the lock, running the two functions concurrently would be undefined behaviour because there's a data race in the access of last and last_Value Moreover (though not causing UB) the printing would be unpredictable.

With the lock, the program becomes essentially single-threaded, and is probably slower than the naive single-threaded code. But that's just in the nature of the problem (i.e. to produce a serialized sequence of events).

answered May 05 '13 at 11:13

Kerrek SB

464,522
92
875
1,084

The interviewer said that the other while loop can't be executed until the running while loop change the 'last'. – prabhakaran May 05 '13 at 11:15
3

@prabhakaran: It's a data race and thus UB, no matter what the interviewer believes. See section 1.10 of the standard. You can trivially see that the load and the store of the `last` variable are racing. – Kerrek SB May 05 '13 at 11:25

score 3 · Answer 2 · 2013-05-05T12:27:47.163

I think the interviewer might have thought about using atomic variables.

Each instantiation and full specialization of the std::atomic template defines an atomic type. Objects of atomic types are the only C++ objects that are free from data races; that is, if one thread writes to an atomic object while another thread reads from it, the behavior is well-defined. In addition, accesses to atomic objects may establish inter-thread synchronization and order non-atomic memory accesses as specified by std::memory_order.

[Source]

By this I mean the only thing you should change is remove the locks and change the lastvariable to std::atomic<int> last = 2; instead of int last = 2;

This should make it safe to access the last variable concurrently.

Out of curiosity I have edited your code a bit, and ran it on my Windows machine:

#include <iostream>
#include <atomic>
#include <thread>
#include <Windows.h>

std::atomic<int>    last=2;
std::atomic<int>    last_Value = 0;
std::atomic<bool>   running = true;

void function_Thread_1()
{
      while(running)
      {
          if(last == 2)
          {
              last_Value = last_Value + 1;
              std::cout << last_Value << std::endl;
              last = 1;
          }
      }
}

void function_Thread_2()
{
      while(running)
      {
          if(last == 1)
          {
              last_Value = last_Value + 1;
              std::cout << last_Value << std::endl;
              last = 2;
          }
      }
}

int main() 
{
    std::thread a(function_Thread_1);
    std::thread b(function_Thread_2);

    while(last_Value != 6){}//we want to print 1 to 6

    running = false;//inform threads we are about to stop

    a.join();
    b.join();//join

    while(!GetAsyncKeyState('Q')){}//wait for 'Q' press
    return 0;
}

and the output is always:

Ideone refuses to run this code (compilation errors)..

Edit: But here is a working linux version :) (thanks to soon)

I got the code to compile. You need to use .load() .store() on atomics (they don't overload "normal" int operations.) http://ideone.com/6iwxhv It throws on execution still. — Wandering Logic, May 05 '13 at 12:14
@WanderingLogic see [this](http://stackoverflow.com/questions/8649828/what-is-wrong-with-this-use-of-stdthread), ideone doesn't specify the -pthread command line argument I think.. — , May 05 '13 at 12:19
@WanderingLogic, the problem wasn't with `store`/`load` functions. `std::atomic last=2;`(and others) invokes a deleted copy-constructor. About throwing an exception - you could use, for example, [this service](http://coliru.stacked-crooked.com/view?id=f0d48fb2ffa598f6ebbb41e030e64a4d-7f4bc2c90970fdf19c48ebf8cc2bae3f) — awesoon, May 05 '13 at 12:21
@soon and The Pet: you are both so right, and I'm wrong! The overloads are there except for the deleted copy constructor. — Wandering Logic, May 05 '13 at 12:25
@The_Pet I dont think that the interviewer has the knowledge about atomic variables. He doesn't know even about 'trylock'. — prabhakaran, May 05 '13 at 15:35
@prabhakaran: I agree the interviewer is clueless. The point is he was _incorrectly_ assuming C++ guarantees sequential consistency. To get sequential consistency you need to declare those variables `atomic`. — Wandering Logic, May 05 '13 at 18:27

Grizzly · Accepted Answer · 2013-05-05T11:22:58.990

The interviewer doesn't know what he is talking about. Without the locks you get races on both last and last_value. The compiler could for example reorder the assignment to last before the print and increment of last_value, which could lead to the other thread executing on stale data. Furthermore you could get interleaved output, meaning things like two numbers not being seperated by a linebreak.

Another thing, which could go wrong is that the compiler might decide not to reload last and (less importantly) last_value each iteration, since it can't (safely) change between those iterations anyways (since data races are illegal by the C++11 standard and aren't acknowledged in previous standards). This means that the code suggested by the interviewer actually has a good chance of creating infinite loops of doing absoulutely doing nothing.

While it is possible to make that code correct without mutices, that absolutely needs atomic operations with appropriate ordering constraints (release-semantics on the assignment to last and acquire on the load of last inside the if statement).

Of course your solution does lower efficiency due to effectivly serializing the whole execution. However since the runtime is almost completely spent inside the streamout operation, which is almost certainly internally synchronized by the use of locks, your solution doesn't lower the efficiency anymore then it already is. Waiting on the lock in your code might actually be faster then busy waiting for it, depending on the availible resources (the nonlocking version using atomics would absolutely tank when executed on a single core machine)

I think 'busy wait' and 'spinlock' are the same. Will using of atomic increase the efficiency? or will it lower the efficiency? — prabhakaran, May 09 '13 at 07:26
@prabhakaran: That depends on what you call efficiency and what machine the code is running on. Using busy waiting can increase the overall speed, if the each thread has it's own core to run on (and no contention with other threads/processes for that one). The reason for this is that the latency introduced when switching between threads is much less when that switch only needs one check of an atomic as oposed to some os-calls, which might actually have to wakeup the thread. However busy-waiting will uselessly burn a lot of cpu power(the code is doing nothing useful half of the time while still — Grizzly, May 09 '13 at 10:16
blocking one cpu core. So it will decrease overall system performance as well as possibly increasing energy consumption. If the system doesn't have enough free cores (meaning that the threads havily content with other threads/processes for cpu resources, the atomic/busy waiting approach will be pretty bad. In that case the waiting thread will effectivly occupy resources the thread actually doing work could use, slowing down the working thread (and therefore the whole program). — Grizzly, May 09 '13 at 10:21

C++ - Threads without coordinating mechanism like mutex_Lock

3 Answers3