9
int f(int);

Multiple threads can call this function. The function should return

argument * argument_used_in_first_call_to_function

I have coded as below. Even though it is thread-safe it is not fast as it uses mutex lock/unlock. Is there a faster solution while still being thread safe?

mutex mut1;
int f(int x)
{
  pthread_mutex_lock(mut1);
  static bool first_init = true;
  static int first_arg = 0;

  if (first_init)
  {
    first_arg = x;
    first_init = false;
  }
  pthread_mutex_unlock(mut1);
  return x * first_arg;
}
Medicine
  • 1,923
  • 2
  • 23
  • 33

3 Answers3

12

IF you have a c++11 compliant compiler
(e.g. NOT VS2013)

Both, the easiest and most efficient way is to just write:

int f(int x) {
    static int firstArg = x;
    return firstArg*x;
}

The c++11 standard requires that initialization of function local static variables is thread safe *). To be more precise, it requires that only one thread initializes the variable and that all other threads wait, until the initialization is completed (later reads and writes can of course still race, but as this is the only write access to firstArg, no additional synchronization is required here).

If your compiler doesn't support "magic statics"

The next best method is to use std::call_once as suggested by Sebastian Redl, which has the same semantics.

If initialization via std::call_once is too slow (it probably uses a mutex) and arg is a built-in type (like int), you can try the following (I didn't do any measurements):

namespace {
    const int DISALLOWED_VALUE = std::numeric_limits<int>::max();
    std::atomic<int> firstArg= DISALLOWED_VALUE;
}

int f(int x) {
    if (firstArg.load(std::memory_order_relaxed) == DISALLOWED_VALUE) {
        int tmp = DISALLOWED_VALUE;
        firstArg.compare_exchange_strong(tmp, x);
    }   
    return firstArg.load(std::memory_order_relaxed)*x;
}   

DISALLOWED_VALUE is some value that can not possibly be passed to f as a valid argument. In this case std::numeric_limits<int>::max() would cause integer overflow when multiplied with itself, so it is not a valid argument for f and can thus serve as an indicator that firstArg hast not been initialized yet.

Warning: Only use this if you have verified, that std::call_once is unacceptably slow for your particular workload (which will almost never be the case) and that this version is actually a sufficient improvement.

Note on conditional locking

As there are/were some answers that proposed various faulty conditional locking algorithms, I also put up a correct manual implementation of double checked locking.

namespace { 
    std::atomic<bool> isInit = false; //has to be atomic
    std::mutex mux;
}
int f(int x) {
    static int firstArg;
    if (!isInit.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> lg(mux);
        if (!isInit.load(std::memory_order_acquire)) {
            firstArg = x;
            isInit.store(true,std::memory_order_release);
        }
    }
    return firstArg*x;
}

The two important parts are:

  • Do double checking (once inside and once outside of the protected region)
  • Use an std::atomic for the flag. Otherwise, the order, in which threads that don't lock observe the stores to the flag and the variable, is not guaranteed.

Acknowledgements:
The double checked locking version is based on the presentation by Herb Sutter on cppcon2014 and was augmented based on the comments/answers by EOF and Sebastian.


*) See e.g. this question and from the latest working draft of the c++14 standard (6.7 point 4):

If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.

Community
  • 1
  • 1
MikeMB
  • 20,029
  • 9
  • 57
  • 102
  • Wrong: `static` data is process-wide so is common to all threads. – Basile Starynkevitch Sep 04 '15 at 05:13
  • 4
    @BasileStarynkevitch: That is what tho OP wanted, no? The same value for alll threads. The important thing ist that the initizalization is threadsafe in c++11: http://stackoverflow.com/questions/8102125/is-local-static-variable-initialization-thread-safe-in-c11. So there is no data race, even if the above function is accessed by multiple thrads at the same time. – MikeMB Sep 04 '15 at 05:14
  • @MikeMB I believe you are correct (although you should be returning `firstArg * x`). – The Dark Sep 04 '15 at 05:15
  • @BasileStarynkevitch: I added some additional explanation, I hope my point is clear now. If you still think I made an error - or missunderstood the question - please provide some additional explanation. – MikeMB Sep 04 '15 at 05:38
  • @BasileStarynkevitch I understand that this is want OP wanted. If not, `static thread_local int firstArg = x;` would do the trick. – sbabbi Sep 04 '15 at 09:46
  • Magic statics were introduced in the just-released Visual Studio 2015. – Sebastian Redl Sep 04 '15 at 10:12
  • You have a race here, the store to `isInit` can happen before the store to `firstArg`. – Daniel Sep 04 '15 at 10:21
  • @Dani: How? `isInit` is an atomic – MikeMB Sep 04 '15 at 10:29
  • @SebastianRedl: Thanks! – MikeMB Sep 04 '15 at 10:30
  • I'd recommend making the first `if (!isInit)` use a memory_order_aquire load. By default, it's sequential consistency, which is unnecessary, AFAICT. – EOF Sep 04 '15 at 10:31
  • @EOF: Good suggestion. Can/should one also make the store `memory_order_release` then? And why only the first one? Relaxed memory orderings aren't my strong suit. – MikeMB Sep 04 '15 at 10:33
  • Quite possibly. Actually, you might even be able to drop the first load to memory_order_relaxed, but don't quote me on that. As to why I only care about the first load: Well, that's the only one that matters in the long run, once initialization is complete and visible to all threads. – EOF Sep 04 '15 at 10:35
  • @EOF: Why did you delete your solution? – MikeMB Sep 08 '15 at 13:50
  • @MikeMB: I just realized you'd already found this idea. I hadn't checked your latest edits. – EOF Sep 08 '15 at 14:39
  • @EOF: Well, it wasn't quite the same. Although I'm not sure if that would actually affect performance, I found the idea interesting, to use a local, non-atomic temporary, to save an atomic load. Also, I've to admit, that - due to an oversight - the first load in my solution had an aquire ordering before I read your post - thanks for that. – MikeMB Sep 08 '15 at 15:05
6

Mike's magic static answer is the best if your compiler supports them. If you're on Visual Studio 2013, the best way to do it is to use std::call_once, not custom flags and mutexes.

#include <mutex>
namespace {
  std::once_flag fFirstCallFlag;
}
int f(int arg) {
  static int firstValue;
  std::call_once(fFirstCallFlag, [&firstValue, arg] { firstValue = arg; });
  return firstValue * arg;
}
Sebastian Redl
  • 69,373
  • 8
  • 123
  • 157
  • Are you sure, that `firstValue` can be a local static? I thought, if you don't explicitly initialize them, they get implicitly initialized to zero upon the first time, the control flow passes through them. – MikeMB Sep 04 '15 at 10:21
  • 1
    No, that implicit zero-initialization is static, not dynamic, so it happens at program load. If it was an object with a constructor, or the initial value required dynamic computation, then you would be right. – Sebastian Redl Sep 04 '15 at 11:00
  • Right, I forgot about that (as I did about call_once). Thanks for the info. – MikeMB Sep 04 '15 at 11:45
  • Do you know what is the case for `std::atomic`? – MikeMB Sep 04 '15 at 12:10
  • It's theoretically safe (it's required to have a constexpr value constructor, which means initialization should be static), but I'm not sure about Visual Studio 2013, which doesn't support constexpr. – Sebastian Redl Sep 04 '15 at 12:22
1

If you still consider implementation with some lock, try spinlock. It is keeping the thread in the user space with spin, and is not making switch to kernel space if operations are fast

#include "boost/smart_ptr/detail/spinlock.hpp"

boost::detail::spinlock lock;

bool first_init = true;
int first_arg = 0;

int f(int x)
{
  std::lock_guard<boost::detail::spinlock> guard(lock);
  if (first_init)
  {
    first_arg = x;
    first_init = false;
  }
  return x * first_arg;
}