I wrote a small test project to see whether std::call_once blocks while executing the callable. The output suggests that call_once has two behaviours: it blocks on detached threads and does not block on joined ones. I strongly suspect that this cannot be true, but I can't draw any other conclusion. Please guide me to the correct one.

#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

using namespace std;
once_flag f;
mutex cout_sync;

void my_pause() 
{
  volatile int x = 0;
  for(int i=0; i<2'000'000'000; ++i) { x++; }
}

void thr(int id) 
{
  auto start = chrono::system_clock::now();
  call_once(f, my_pause);
  auto end = chrono::system_clock::now();
  scoped_lock l{cout_sync};
  cout << "Thread " << id << " finished in " << (static_cast<chrono::duration<double>>(end-start)).count() << " sec" << endl;
}

int main() 
{
  vector<thread> threads;
  for(int i=0; i<4; i++)
  {
    threads.emplace_back(thr, i);
    threads.back().join();
  }      
  return 0;
}

Output:

Thread 0 finished in 4.05423 sec
Thread 1 finished in 0 sec
Thread 2 finished in 0 sec
Thread 3 finished in 0 sec

Changing threads to detached:

for(int i=0; i<4; i++)
{
  threads.emplace_back(thr, i);
  threads.back().detach();
}
this_thread::sleep_for(chrono::seconds(5));

Output:

Thread 0 finished in 4.08223 sec
Thread 1 finished in 4.08223 sec
Thread 3 finished in 4.08123 sec
Thread 2 finished in 4.08123 sec

Visual Studio 2017

  • Are you sure you want to join the threads in the loop just after you started them? – Some programmer dude Aug 08 '19 at 12:20
  • I don't see how this can be the root of the problem, please clarify. – Problem Sir Aug 08 '19 at 12:22
  • And with the detached threads, you might want to increase the time in the `sleep_for`. With a slightly different workload on your system, you might otherwise exit the process (and kill the threads) before one or more threads have finished. – Some programmer dude Aug 08 '19 at 12:22
  • If it was, I would have written an answer. It's merely a comment on you not running the threads in parallel, only serially (i.e. it's no different from plainly calling `thr` as a normal function in the loop). – Some programmer dude Aug 08 '19 at 12:23
  • @Someprogrammerdude I think that *is* the root of the problem though, isn't it? In the code that uses `join`, the `main` thread blocks when calling `join` on the 1st thread started, but that thread is blocked on the `call_once` invocation. Subsequent threads are therefore unaffected by `call_once`. I think. – G.M. Aug 08 '19 at 12:28
  • related: [Is std::call_once reentrant and thread safe?](https://stackoverflow.com/questions/22692783/is-stdcall-once-reentrant-and-thread-safe) – Lasall Aug 08 '19 at 12:32
  • Your test is incorrect. By joining each thread right after starting it, you do not start the next thread until `my_pause()` has executed once. The other threads don't wait for anything, since their `call_once()` is passive. In the detached case all threads start at about the same time; one executes `my_pause()` while the others wait for it to finish. – sklott Aug 08 '19 at 12:33
  • Wow, that really is it, thank you guys! – Problem Sir Aug 08 '19 at 12:33

2 Answers

It is in fact because, in the joined version, you join each thread before starting the next one.

The blocking behaviour follows from the specification of call_once:

If that invocation throws an exception, it is propagated to the caller of call_once, and the flag is not flipped so that another call will be attempted (such call to call_once is known as exceptional).

This means that if the call_once'd function throws an exception, it is not considered to be called, and the next call to call_once will invoke the function again.

This means that the entire call_once() is effectively protected by an internal mutex. If a call_once-d function is being executed, any other thread that enters call_once() must be blocked, until the call_once-d function returns.

In the first version you join the threads one at a time, so the second thread isn't even started until call_once has already returned in the first thread.

In the second version you start all four detached threads at practically the same time, so all four threads enter call_once approximately together.

One of those threads will end up executing the called function.

The other threads will be blocked until the called function either returns, or throws an exception.

So every thread ends up waiting for the called function to finish.

This has nothing to do with detached threads.

If you change the first version of the code to start all four threads first, and then join them all, you'll see the same behavior.

Sam Varshavchik

Different isn’t the same.

The non-detached threads are being run sequentially — the code waits until one thread finishes before it launches the next. So the first one hits the wait loop and the others don’t.

The detached threads run simultaneously. One of them runs the wait loop and the others block until the wait loop is finished.

Change the code for the non-detached threads to run them simultaneously. To do that, move the join outside of the loop that creates the threads.

Pete Becker