
When implementing a class intended to be thread-safe, should I include a memory barrier at the end of its constructor, in order to ensure that any internal structures are fully initialized before they can be accessed? Or is it the responsibility of the consumer to insert the memory barrier before making the instance available to other threads?

Simplified question:

Is there a race hazard in the code below that could give erroneous behaviour due to the lack of a memory barrier between the initialization and the access of the thread-safe class? Or should the thread-safe class itself protect against this?

ConcurrentQueue<int> queue = null;

Parallel.Invoke(
    () => queue = new ConcurrentQueue<int>(),
    () => queue?.Enqueue(5));

Note that it is acceptable for the program to enqueue nothing, as would happen if the second delegate executes before the first. (The null-conditional operator ?. protects against a NullReferenceException here.) However, it should not be acceptable for the program to throw an IndexOutOfRangeException, NullReferenceException, enqueue 5 multiple times, get stuck in an infinite loop, or do any of the other weird things caused by race hazards on internal structures.

Elaborated question:

Concretely, imagine that I were implementing a simple thread-safe wrapper for a queue. (I'm aware that .NET already provides ConcurrentQueue<T>; this is just an example.) I could write:

public class ThreadSafeQueue<T>
{
    private readonly Queue<T> _queue;

    public ThreadSafeQueue()
    {
        _queue = new Queue<T>();

        // Thread.MemoryBarrier(); // Is this line required?
    }

    public void Enqueue(T item)
    {
        lock (_queue)
        {
            _queue.Enqueue(item);
        }
    }

    public bool TryDequeue(out T item)
    {
        lock (_queue)
        {
            if (_queue.Count == 0)
            {
                item = default(T);
                return false;
            }

            item = _queue.Dequeue();
            return true;
        }
    }
}

This implementation is thread-safe, once initialized. However, if the initialization itself is raced by another consumer thread, then race hazards could arise, whereby the latter thread would access the instance before the internal Queue<T> has been initialized. As a contrived example:

ThreadSafeQueue<int> queue = null;

Parallel.For(0, 10000, i =>
{
    if (i == 0)
        queue = new ThreadSafeQueue<int>();
    else if (i % 2 == 0)
        queue?.Enqueue(i);
    else
    {
        int item = -1;
        if (queue?.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});

It is acceptable for the code above to miss some numbers; however, without the memory barrier, it could also throw a NullReferenceException (or produce some other weird result) because the internal Queue<T> might not have been initialized by the time Enqueue or TryDequeue is called.

Is it the responsibility of the thread-safe class to include a memory barrier at the end of its constructor, or is it the consumer who should include a memory barrier between the class's instantiation and its visibility to other threads? What is the convention in the .NET Framework for classes marked as thread-safe?

Edit: This is an advanced threading topic, so I understand the confusion in some of the comments. An instance can appear as half-baked if accessed from other threads without proper synchronization. This topic is discussed extensively within the context of double-checked locking, which is broken under the ECMA CLI specification without the use of memory barriers (such as through volatile). Per Jon Skeet:

The Java memory model doesn't ensure that the constructor completes before the reference to the new object is assigned to instance. The Java memory model underwent a reworking for version 1.5, but double-check locking is still broken after this without a volatile variable (as in C#).

Without any memory barriers, it's broken in the ECMA CLI specification too. It's possible that under the .NET 2.0 memory model (which is stronger than the ECMA spec) it's safe, but I'd rather not rely on those stronger semantics, especially if there's any doubt as to the safety.
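For reference, the double-checked locking pattern under discussion looks roughly like the sketch below (the class and field names are purely illustrative; the volatile modifier on the field is the ingredient the quoted text identifies as necessary in C#):

public sealed class Singleton
{
    // volatile prevents the reference from being published before the
    // constructor's writes become visible to other threads.
    private static volatile Singleton _instance;
    private static readonly object _lock = new object();

    private Singleton() { }

    public static Singleton Instance
    {
        get
        {
            if (_instance == null)              // first check, without the lock
            {
                lock (_lock)
                {
                    if (_instance == null)      // second check, under the lock
                        _instance = new Singleton();
                }
            }
            return _instance;
        }
    }
}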

Douglas
  • The source code for `ConcurrentQueue` that you mentioned doesn't have any protection in its constructor. Make of that what you will. http://referencesource.microsoft.com/#mscorlib/system/Collections/Concurrent/ConcurrentQueue.cs,18bcbcbdddbcfdcb – Bradley Uffner Aug 10 '16 at 19:17
  • How about the consumer initializing using Lazy which makes the initialization thread-safe ? :) – Zein Makki Aug 10 '16 at 19:24
  • Barring a constructor that actually has async calls within it, can a reference even be set to reference an instance before the instance is constructed? – Uueerdo Aug 10 '16 at 19:25
  • @Uueerdo As can be observed from a single thread, no. As can be observed from multiple threads, yes. – Servy Aug 10 '16 at 19:26
  • @BradleyUffner: Good point. That makes for a clearer question. – Douglas Aug 10 '16 at 19:26
  • How exactly can the caller make the instance accessible to other threads before the instance is constructed, since the caller itself does not have access to the instance yet? The only way I see of making the instance accessible before it is constructed is if the constructor calls some external code passing `this`. – Ivan Stoev Aug 10 '16 at 19:27
  • @user3185569: That would be a good solution if a race hazard exists. My question is whether the `Lazy` is necessary, or whether the thread-safe class should provide that protection itself. – Douglas Aug 10 '16 at 19:27
  • I think this may only be a problem with `static` constructors (or constructors that share `static` state). Instance constructors generally seem safe. – Bradley Uffner Aug 10 '16 at 19:30
  • @Douglas my interpretation has always been that the resource is thread-safe but I don't have the resource until after it is initialized so it's the callers job to protect the initialization – HasaniH Aug 10 '16 at 19:31
  • @IvanStoev In a single threaded context, yes, in a multithreaded context, you can observe the orders of operations to be different than the guarantees are for a single threaded program. Your CPU is allowed to reorder different writes to entirely different values that don't depend on each other. – Servy Aug 10 '16 at 19:49
  • @Servy I know that in general, but could you please provide a link or something that covers the constructors? All I see is *It is explicitly not a requirement that a conforming implementation of the CLI guarantee that all state updates performed within a constructor be uniformly visible before the constructor completes. CIL generators can ensure this requirement themselves by inserting appropriate calls to the memory barrier or volatile write instructions*. But this is speaking about **before the constructor completes**. – Ivan Stoev Aug 10 '16 at 19:57
  • @IvanStoev The specs state what *is* required; they don't provide a comprehensive list of all the things that *aren't* required. The specs aren't going to say that a constructor is allowed to return the object before the constructor has finished; rather, the value a constructor returns isn't specifically on the list of operations that different threads are guaranteed to observe in an order logically consistent with a single threaded program. – Servy Aug 10 '16 at 20:00
  • @Servy so you are saying the assignment can occur after allocation, but before construction/initialization, even when all those occur on the same thread? – Uueerdo Aug 10 '16 at 20:01
  • @IvanStoev: Given the statement `_queue = new Queue();`, the compiler / architecture / memory model may cause the memory location of the new object to appear to be assigned to `_queue` *before* the `Queue()` constructor completes. – Douglas Aug 10 '16 at 20:01
  • @Uueerdo A single threaded program could not observe those actions happening out of order. Another thread observing the actions performed on another has *radically* fewer constraints on the observed ordering of operations. As it pertains to this example, a second thread can in fact observe a constructor call from another thread returning the instance before the constructor itself has finished running. The thread that called the constructor can't observe that unusual ordering, but any other thread can. – Servy Aug 10 '16 at 20:05
  • @Douglas I upvoted your question because I find it interesting. But if a high level language like C# can't provide such a simple guarantee, then I don't see what we are doing here. I quit programming :) – Ivan Stoev Aug 10 '16 at 20:10
  • @IvanStoev: You're far from alone in your frustration. The overwhelming consensus is that the majority of programmers should avoid low-level synchronization techniques, and just rely on `lock` or higher-level frameworks such as TPL. – Douglas Aug 10 '16 at 20:25
  • @Theodor Zoulias On x86, stores can't be reordered with respect to each other, so if you see the result of the constructor then you're also seeing the value of `_queue`. On ARM64, all bets are off. I'm *guessing* the JIT compiler will insert a barrier to avoid breaking existing programs, but unfortunately I don't know how to check (I don't have an ARM64 cpu, and sharplab only supports x86) – Kevin Gosse Jan 03 '21 at 10:13

4 Answers


Lazy<T> is a very good choice for Thread-Safe Initialization. I think it should be left to the consumer to provide that:

var queue = new Lazy<ThreadSafeQueue<int>>(() => new ThreadSafeQueue<int>());

Parallel.For(0, 10000, i =>
{
    if (i % 2 == 0)
        queue.Value.Enqueue(i);
    else
    {
        int item = -1;
        if (queue.Value.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});
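As a side note, the single-argument Lazy<T>(Func<T>) constructor used above defaults to LazyThreadSafetyMode.ExecutionAndPublication, which is the mode doing the work here: only one thread runs the factory, and the published result is safely visible to all threads. Spelling the mode out is an optional variation that merely makes the intent explicit:

// Equivalent to the one-argument constructor; requires using System.Threading;
var queue = new Lazy<ThreadSafeQueue<int>>(
    () => new ThreadSafeQueue<int>(),
    LazyThreadSafetyMode.ExecutionAndPublication);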
Zein Makki
  • +1: This is what I currently do, given my uncertainty. But I would like to know whether the code should be expected to work *without* the thread synchronization from the consumer. In other words: If I were the one implementing the thread-safe class, should I be able to call my class "thread-safe" if consumers still needed to use `Lazy`? – Douglas Aug 10 '16 at 19:29
  • I don't understand the edit. My original code would never have executed the initialization twice either. – Douglas Aug 10 '16 at 19:45
  • @Douglas Have you actually got a `null` queue when running this code? At least once? – Zein Makki Aug 10 '16 at 19:52
  • @user3185569 The code can never print `null`, but it can drop items on the floor if they're processed before the queue has been initialized. You're assuming that the 0th iteration completes before any of the other iterations start. There is no such guarantee. – Servy Aug 10 '16 at 19:54
  • @user3185569: I might have chosen a bad example. It is acceptable for the code to drop some numbers. It is not acceptable for the code to throw `NullReferenceException`, `IndexOutOfRangeException`, print repeated numbers, get stuck in infinite loops, or do any of the other weird things caused by race hazards on internal structures. – Douglas Aug 10 '16 at 19:56

I'll attempt to answer this interesting and well-presented question, based on the comments by Servy and Douglas, and on information from other related questions. What follows are just my assumptions, not solid information from a reputable source.

  1. Thread-safe classes have properties and methods that can be safely invoked by multiple threads concurrently, but their constructors are not thread-safe. This means that it is entirely possible for a thread to "see" an instance of a thread-safe class in an invalid state, if that instance is being constructed concurrently by another thread.

  2. Adding the line Thread.MemoryBarrier(); at the end of the constructor is not enough to make the constructor thread-safe, because this statement only affects the thread that runs the constructor¹. The other threads that may concurrently access the under-construction instance are not affected. Memory visibility is cooperative, and one thread cannot change what another thread "sees" by altering the other thread's execution flow (or invalidating the local cache of the CPU core that the other thread is running on) in a non-cooperative manner.

  3. The correct and robust way to ensure that all threads see the instance in a valid state is to include proper memory barriers in all threads. This can be achieved either by declaring the variable that holds the instance as volatile, if it is a field of a class, or otherwise by using the methods of the static Volatile class:

ThreadSafeQueue<int> queue = null;

Parallel.For(0, 10000, i =>
{
    if (i == 0)
        Volatile.Write(ref queue, new ThreadSafeQueue<int>());
    else if (i % 2 == 0)
        Volatile.Read(ref queue)?.Enqueue(i);
    else
    {
        int item = -1;
        if (Volatile.Read(ref queue)?.TryDequeue(out item) == true)
            Console.WriteLine(item);
    }
});

In this particular example it would be simpler and more efficient to instantiate the queue variable before invoking the Parallel.For method. Doing so would make the explicit Volatile invocations unnecessary. The Parallel.For method uses Tasks internally, and the TPL includes the appropriate memory barriers at the beginning/end of each task. Memory barriers are generated implicitly and automatically by the .NET infrastructure, by any built-in mechanism that starts a thread or causes a delegate to execute on another thread. (citation)
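For illustration, that simpler arrangement could look roughly like this (a sketch, assuming the ThreadSafeQueue<int> class from the question):

// The queue is fully constructed on the main thread before Parallel.For starts;
// the TPL's own barriers when scheduling the iterations make the instance safely
// visible to the worker threads, so no explicit Volatile calls are needed here.
ThreadSafeQueue<int> queue = new ThreadSafeQueue<int>();

Parallel.For(0, 10000, i =>
{
    if (i % 2 == 0)
        queue.Enqueue(i);
    else
    {
        int item;
        if (queue.TryDequeue(out item))
            Console.WriteLine(item);
    }
});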

I'll repeat that I'm not 100% confident about the correctness of the information presented above.

¹ Quoting from the documentation of the Thread.MemoryBarrier method: Synchronizes memory access as follows: The processor executing the current thread cannot reorder instructions in such a way that memory accesses prior to the call to MemoryBarrier() execute after memory accesses that follow the call to MemoryBarrier().

Theodor Zoulias
  • "Adding the line `Thread.MemoryBarrier();` at the end of the constructor is not enough to make the constructor thread-safe." <- That isn't necessarily true for thread-safe classes like the one I presented. Yes, memory-visibility is cooperative, but all the other methods of the class already ensure this through their `lock` statements, which generate an implicit memory barrier. – Douglas Jan 04 '21 at 17:44
  • @Douglas your `ThreadSafeQueue` class creates a `Queue` in its constructor. The constructor of the `Queue` class [doesn't do much](https://referencesource.microsoft.com/system/compmod/system/collections/generic/queue.cs.html#a09109def640b7b1). It just assigns the static field `T[] _emptyArray` as the value of the private field `T[] _array`. But how confident are you that another thread that is handed a reference to a newly constructed `ThreadSafeQueue` will not see `_array` having its original `null` value, causing a `NullReferenceException`? – Theodor Zoulias Jan 04 '21 at 18:09
  • The only way of accessing `_array` is through the `Enqueue` or `TryDequeue` methods, both of which take a `lock` (and hence an implicit memory barrier) before the `_array` instance is accessed. That said, I'm not sure whether the implicit barrier protects against the `x` in `lock(x)` still being `null`, so you have a point there if that's what you meant. – Douglas Jan 04 '21 at 21:44
  • @Douglas good point about the memory barrier inserted implicitly by the `lock`. I hadn't thought about it. It may be the ingredient that allows safe access to a non-volatile variable/field that stores instances of your class. Or may not. Hopefully some expert that has studied the CLR/C# specifications and knows about this stuff, will be able to enlighten us! – Theodor Zoulias Jan 04 '21 at 21:59
  • I've been studying this topic for a while, and the conclusion I've arrived at is that the memory model in the ECMA CLI spec is broken. However, the CLR and all other mainstream implementations of the runtime offer a stronger memory model than what is required by the spec, thus avoiding these pitfalls. – Douglas Jan 04 '21 at 22:09
  • @Douglas not happy to hear about this conclusion! Personally I've adopted the following advice by Igor Ostrovsky: *"All code you write should rely only on the guarantees made by the ECMA C# specification, and not on any of the implementation details explained in this article."* From [this](https://learn.microsoft.com/en-us/archive/msdn-magazine/2012/december/csharp-the-csharp-memory-model-in-theory-and-practice) article. But if the spec is broken, I may have to reconsider. – Theodor Zoulias Jan 04 '21 at 22:54
  • I haven't read Igor, but many world experts concur that the CLI model is too weak, vague, or outright broken. See [Jon Skeet](https://codeblog.jonskeet.uk/2006/11/26/the-cli-memory-model-and-specific-specifications/) and [Joe Duffy](http://joeduffyblog.com/2010/12/04/sayonara-volatile/) for two examples. – Douglas Jan 07 '21 at 18:28
  • One more confirmation that the official CLR implementation protects against this kind of stuff, even on ARM64: https://github.com/dotnet/runtime/issues/46911#issuecomment-760004625 – Kevin Gosse Jan 14 '21 at 13:38

Should thread-safe class have a memory barrier at the end of its constructor?

I do not see a reason for this. The queue is a local variable that is assigned from one thread and accessed from another. Such concurrent access should be synchronized, and it is the responsibility of the accessing code to do so. It has nothing to do with the constructor or the type of the variable; such access should always be explicitly synchronized, or you are entering a dangerous area even for primitive types (even if the assignment is atomic, you may get caught in some cache trap). If the access to the variable is properly synchronized, it does not need any support in the constructor.
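As a concrete illustration of that point, one way the consumer could synchronize the access explicitly is to publish and read the shared reference under a lock. This is only a sketch: `gate` is an illustrative lock object, and ThreadSafeQueue<int> is the class from the question.

ThreadSafeQueue<int> queue = null;
object gate = new object();

Parallel.For(0, 10000, i =>
{
    if (i == 0)
    {
        var newQueue = new ThreadSafeQueue<int>();
        lock (gate) { queue = newQueue; }   // publish the reference under the lock
    }
    else
    {
        ThreadSafeQueue<int> local;
        lock (gate) { local = queue; }      // read the reference under the same lock
        if (local == null)
            return;                         // not published yet; skip this iteration

        if (i % 2 == 0)
            local.Enqueue(i);
        else
        {
            int item;
            if (local.TryDequeue(out item))
                Console.WriteLine(item);
        }
    }
});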

Antonín Lejsek
  • Without saying that this answer is incorrect, I feel that it ventures outside the context of the asked question. This question is about classes, not primitive types (which are usually structs). In this context the [atomicity of assignment](https://stackoverflow.com/questions/5209623/is-a-reference-assignment-threadsafe) should be considered a fact, not something iffy. – Theodor Zoulias Jan 01 '21 at 19:07

No, you don't need a memory barrier in the constructor. Your assumption, even though it demonstrates some creative thought, is wrong. No thread can get a half-baked instance of queue. The new reference is "visible" to the other threads only when the initialization is done. Suppose thread_1 is the first thread to initialize queue: it goes through the ctor code, but queue's reference in the main stack is still null! Only when thread_1 exits the constructor code does it assign the reference.

See the comments below and the OP's elaborated question.

shay__
  • Unfortunately, I think you're missing the intricacies of the ECMA CLI memory model. You *can* get a half-baked instance of `queue` visible to other threads. – Douglas Aug 10 '16 at 19:39
  • This answer is unfortunately basically wishful thinking. It assumes synchronisation when there is none. –  Aug 10 '16 at 19:43
  • @Douglas I have to admit I did not think about it. Yet, you don't find any evidence of memory barriers in `System.Collections.Concurrent` classes ctors which are **by definition** thread safe. Your topic extends thread safety definition to new regions. And that's pretty cool :) – shay__ Aug 10 '16 at 20:00
  • `System.Collections.Concurrent classes ctors which are by definition thread safe` That's not true. They're not safe *by definition*. They're just implementations of types, like any other. They're much less likely to have mistakes than code you or I write, but it's not impossible for them to have mistakes; their code is not correct *by definition*, but rather there is a high degree of confidence in its correctness. – Servy Aug 10 '16 at 20:02
  • It's always good to be reminded that there is much more you don't know than what you think you don't know :) – shay__ Aug 10 '16 at 20:09
  • @Servy: I assume that shay__ meant that the `System.Collections.Concurrent` classes provide the *de facto* standard for what "thread-safe", as a term, should entail. Is there a formal definition / wider consensus for "thread-safe" which covers whether it should include initialization or not? – Douglas Aug 10 '16 at 20:09
  • @Douglas "Thread Safe" is a pretty meaningless term in general. Probably the best definition you'll manage to get is to say that a program that uses multiple threads does what it's supposed to do. You can't really give any more meaningful definition to the term, and as such, the term tends to be largely useless. If you have an object you plan to access from multiple threads you need to specifically define the operations you plan to perform, and the acceptable and unacceptable behaviors of those operations. – Servy Aug 10 '16 at 20:13
  • @Servy: I agree that it's loosely defined. But it appears regularly on MSDN: "All public and protected members of [`ConcurrentQueue`](https://msdn.microsoft.com/en-us/library/dd267265(v=vs.110).aspx#Anchor_10) are thread-safe and may be used concurrently from multiple threads." Given, as others have pointed out, that `ConcurrentQueue` doesn't include a memory barrier at the end of its constructor, then either the documentation or the implementation is incorrect. – Douglas Aug 10 '16 at 20:17
  • @Douglas The fact that the documentation uses an ill-defined term means that the statement itself is simply meaningless. The assertion that, "All public and protected members of ConcurrentQueue are thread-safe" doesn't actually tell you anything about the type. It's not *incorrect*, it's just not actually making any real assertion to begin with, right or wrong. It's as if it had asserted that the type is super cool. That statement isn't wrong; it's not a statement that can be considered either objectively correct or incorrect. – Servy Aug 10 '16 at 20:20
  • @Servy: It's the "may be used concurrently from multiple threads" part that's incorrect. The constructor (a public member) may not be used concurrently with other public members. – Douglas Aug 10 '16 at 20:27
  • @Douglas Again, you need to be much more specific for the statements to have meaning. You would need to define specific usages of the type, along with the expected behavior, before you could say whether or not it satisfies those constraints. You're also assuming that merely having an object instance before its constructor has finished is itself a problem. – Servy Aug 10 '16 at 20:34
  • As long as the queue functions appropriately even if you have an instance whose constructor hasn't finished, then that's not a problem. If you can't observe any operation being performed on the queue happening before the construction has finished, which is likely the case given how those other operations are implemented (namely the use of volatile fields), there is no issue. – Servy Aug 10 '16 at 20:34
  • @Servy: Any further specificity would defeat the purpose of the notion of thread-safety. I understand what you're saying re usages and expected behaviour, but corruption of internal state should never be an acceptable outcome for thread-safe collections. If the public members of `ConcurrentQueue` indeed function correctly even before the constructor completes, then that's fine. But it also imposes the expectation that my `ThreadSafeQueue` should also use `volatile` or memory barriers in its constructor to conform with the .NET convention for thread-safety. – Douglas Aug 10 '16 at 20:52
  • @Douglas `Any further specificity would defeat the purpose of the notion of thread-safety.` That's quite true. I thought I was pretty clear in saying it's a pretty useless term, on its own, and doesn't actually tell you anything useful. It's not about .NET conventions for thread safety, it's about what you want to do with that type and whether or not it satisfies that. Don't try to meet a completely undefined definition of "thread safe". Write a class that meets the contract you need it to meet (right now it looks like it fails to do so). – Servy Aug 10 '16 at 21:05
  • @Servy: I wouldn't agree that the term "thread-safe" is useless if it at least guarantees that concurrent use won't corrupt the internal state or give rise to patently incorrect behaviour (such as a single `Enqueue` call adding the same item twice). Yes, as you're saying, in most cases, one needs more specificity than that (e.g. whether enumerations are moment-in-time or may include subsequent updates). But it would be quite a shortcoming on the .NET team's part if the term is as undefined as you're implying. – Douglas Aug 10 '16 at 21:15
  • @Douglas You've just defined "thread safe" as "behaves correctly". (That is more or less what I defined it as above.) That's just not a particularly useful thing to say, if you have no idea what the correct behavior is, or should be, or needs to be. It's the moral equivalent of begging the question; using that term adds nothing useful to the conversation that you didn't have without it. – Servy Aug 10 '16 at 21:23
  • The specs define the minimum memory model that should be supported, but that does not mean that a stronger memory model cannot be implemented. Maybe those `System.Collections.Concurrent` classes just assume a memory model that is stronger than the specified one because they only have known target CLR implementations. Memory models are also CPU-bound. For example, desktop CPUs are unlikely to have that issue... but maybe that is not the case anymore, since I'm quite dated at CPU architectures. – Miguel Angelo Feb 27 '17 at 16:52
  • Also note that a memory barrier is not needed past the start or end of a `lock` code block. No reads or writes are allowed to be reordered past lock block boundaries... that means that if you find `lock (obj) { }` you can assume no writes or reads will move from before/after the whole block to the other side, or from inside/outside to the other side. – Miguel Angelo Feb 27 '17 at 16:54