I suppose I might as well provide my answer...
First, "Method 1" works fine on C++0x. From the draft standard, section 6.7 (4) (emphasis mine):
> The zero-initialization (8.5) of all block-scope variables with static storage duration (3.7.1) or thread storage duration (3.7.2) is performed before any other initialization takes place. ... Otherwise such a variable is initialized the first time control passes through its declaration; such a variable is considered initialized upon the completion of its initialization. ... *If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.*
So if you have C++0x, "Method 1" is a simple, correct, and 100% portable way to implement a thread-safe singleton. (Even prior to C++0x, g++ on Unix ensures this idiom will work. I do not know about MSVC.) This is also very likely to be the fastest solution, since the compiler can see exactly what you are doing and it knows more than you about your CPU's architecture.
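As a concrete illustration, here is one way "Method 1" typically looks. The `Whatsit` class and its `value()` payload are invented for this sketch; the thread-safety guarantee comes entirely from the function-local static:

```cpp
class Whatsit {
public:
    // Returns the single instance; per 6.7 (4), concurrent first calls
    // block until the initialization below has completed.
    static Whatsit& instance() {
        static Whatsit s_instance;  // "Method 1": function-local static
        return s_instance;
    }
    int value() const { return m_value; }
private:
    Whatsit() : m_value(42) {}          // 42 is just a placeholder payload
    Whatsit(const Whatsit&);            // non-copyable
    Whatsit& operator=(const Whatsit&); // non-assignable
    int m_value;
};
```

Note there is no explicit locking anywhere; the compiler emits whatever guard is needed, which is why it can also optimize it well.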
The easy way to fix "Method 2" is to put a mutex around the entire function, as Mihran suggests. But that might have unfortunate performance consequences, so people are always looking for ways to optimize it. Most of those ways introduce subtle bugs...
The "double-checked locking pattern" is the classic example. It looks like this:
```cpp
if (m_instance == NULL) {
    grab_mutex();
    if (m_instance == NULL)
        m_instance = new Whatsit();
    release_mutex();
}
return m_instance;
```
There are two problems with this pattern. First, individual memory accesses are not guaranteed to be atomic in general; simultaneous loads and stores of a single memory location by different threads can result in garbage being read. (I grant this is unlikely for a pointer value -- and certainly it will not happen on x86/x86_64 -- but do you really want your code to work on just one platform for one day?)
Second, both the compiler and the CPU are free to reorder memory accesses. So the thread that runs the `Whatsit` constructor and then fills in `m_instance` might actually perform those writes out of order, meaning another thread can test `m_instance`, observe it to be non-NULL, and then access the object before it has been initialized. This is not hypothetical; it really happens on modern CPUs. Worse, the more modern the CPU, the more likely it is to be a problem, because CPUs keep getting more and more aggressive about reordering memory accesses.
To fix this pattern, the first read of `m_instance` needs to have "acquire semantics", and the write to `m_instance` needs to have "release semantics". Definitions: If a memory load L has "acquire semantics", then later memory operations may not be reordered to happen before L. Similarly, if a store S has "release semantics", then earlier memory operations must not be reordered to happen after S.
Both of these are required in order for "double-checked locking" to be thread-safe. (Plus the individual loads and stores must be atomic.)
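Under C++0x you can express exactly these semantics portably with `std::atomic`. Here is a sketch of a corrected double-checked locking pattern; the `Singleton` wrapper and the trivial `Whatsit` payload are invented for illustration:

```cpp
#include <atomic>
#include <mutex>

class Whatsit {
public:
    Whatsit() : ready(1) {}
    int ready;
};

class Singleton {
public:
    static Whatsit* instance() {
        // The first read must be an acquire load, so that if we see a
        // non-null pointer, we also see the fully constructed object.
        Whatsit* p = s_instance.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> lock(s_mutex);
            // Relaxed is fine here: the mutex already orders this read.
            p = s_instance.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Whatsit();
                // Publish with a release store: the constructor's writes
                // cannot be reordered after this store.
                s_instance.store(p, std::memory_order_release);
            }
        }
        return p;
    }
private:
    static std::atomic<Whatsit*> s_instance;
    static std::mutex s_mutex;
};

std::atomic<Whatsit*> Singleton::s_instance(nullptr);
std::mutex Singleton::s_mutex;
```

The acquire load on the fast path and the release store at publication are the two ingredients the naive version is missing.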
As Justin points out, declaring `m_instance` "volatile" will provide these semantics on MSVC. But this is not guaranteed by any standard, and in fact it is not true for many compilers (e.g., g++ on Unix). So if you are certain you will never care about any platform other than x86 Windows -- which you aren't -- then "volatile" will work.
What about using a compare-and-swap operation like InterlockedCompareExchangePointer?
```cpp
if (m_instance == NULL) {
    Whatsit *p = new Whatsit();
    if (InterlockedCompareExchangePointer(&m_instance, p, NULL) != NULL)
        delete p;  // another thread won the race; discard our copy
}
return m_instance;
```
The documentation says InterlockedCompareExchangePointer provides a "full memory barrier", which means it definitely has release semantics. So is this correct?
No, not in general. The problem is that the outer read of `m_instance` does not necessarily have "acquire semantics", so in principle another thread could still read the members of the object before they have been initialized. Modern CPUs do perform "speculative loads"; that is, they try to "guess" what you are going to need so they can load it from memory ahead of time. Granted, on a cache-coherent platform like x86, the CPU would have to be near-psychic to read an object before its pointer had even been computed... But if you use this pattern just because it happens to work on your system today, when it breaks on some future system and introduces nigh-impossible-to-debug failures, somebody will be cursing your name. So please do not do this.
Finally, you can use an API designed for one-time initialization, like the "One-Time Initialization API" on Windows or pthread_once on Unix. These work correctly by definition, but obviously they are platform-specific.
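For example, on Unix a `pthread_once` version might be sketched like this (the `Whatsit` payload and the `get_instance` name are hypothetical):

```cpp
#include <pthread.h>

class Whatsit {
public:
    Whatsit() : m_value(42) {}  // 42 is just a placeholder payload
    int value() const { return m_value; }
private:
    int m_value;
};

static Whatsit* m_instance = 0;
static pthread_once_t s_once = PTHREAD_ONCE_INIT;

// pthread_once guarantees this runs exactly once, even if multiple
// threads call get_instance() concurrently; late arrivals block until
// initialization completes.
static void create_instance() {
    m_instance = new Whatsit();
}

Whatsit* get_instance() {
    pthread_once(&s_once, create_instance);
    return m_instance;
}
```

This pushes all of the memory-ordering subtleties into the library, which is exactly where they belong.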
Bottom line: Use "Method 1" if your toolchain supports it; otherwise, fall back on a one-time-initialization API.