7

Suppose that I were designing a thread-safe class that wraps an internal collection:

public class ThreadSafeQueue<T>
{
    private readonly Queue<T> _queue = new Queue<T>();

    public void Enqueue(T item)
    {
        lock (_queue)
        {
            _queue.Enqueue(item);
        }
    }

    // ...
}

Based on my other question, the above implementation is buggy, as race hazards may arise when its initialization is performed concurrently with its usage:

ThreadSafeQueue<int> tsqueue = null;

Parallel.Invoke(
    () => tsqueue = new ThreadSafeQueue<int>(),
    () => tsqueue?.Enqueue(5));

The code above is acceptably non-deterministic: the item may or may not be enqueued. However, under the current implementation, it is also broken, and can give rise to unpredictable behaviour, such as throwing IndexOutOfRangeException, NullReferenceException, enqueueing the same item multiple times, or getting stuck in an infinite loop. This occurs since the Enqueue call might run after the new instance has been assigned to the local variable tsqueue, but before the initialization of the internal _queue field completes (or appears to complete).

Per Jon Skeet:

The Java memory model doesn't ensure that the constructor completes before the reference to the new object is assigned to instance. The Java memory model underwent a reworking for version 1.5, but double-check locking is still broken after this without a volatile variable (as in C#).

This race hazard could be resolved by adding a memory barrier to the constructor:

    public ThreadSafeQueue()
    {
        Thread.MemoryBarrier();
    }

Equivalently, it could be more concisely resolved by making the field volatile:

    private volatile readonly Queue<T> _queue = new Queue<T>();

However, the latter is forbidden by the C# compiler:

'Program.ThreadSafeQueue<T>._queue': a field cannot be both volatile and readonly

Given that the above seems like a justifiable use-case for volatile readonly, is this limitation a flaw in the language design?

I'm aware that one could simply remove the readonly, since it does not affect the public interface of the class. However, that is beside the point, since the same could be said for readonly in general. I'm also aware of the existing question “Why readonly and volatile modifiers are mutually exclusive?”; however, that tackled a different issue.

Concrete scenario: This issue seems to affect code in the System.Collections.Concurrent namespace of the .NET Framework Class Library itself. The ConcurrentQueue<T>.Segment nested class has several fields that are only ever assigned within the constructor: m_array, m_state, m_index, and m_source. Of these, only m_index is declared as readonly; the others cannot be – although they should – since they need to be declared as volatile to meet the requirements of thread-safety.

private class Segment
{
    internal volatile T[] m_array;                  // should be readonly too
    internal volatile VolatileBool[] m_state;       // should be readonly too
    private volatile Segment m_next;
    internal readonly long m_index; 
    private volatile int m_low;
    private volatile int m_high; 
    private volatile ConcurrentQueue<T> m_source;   // should be readonly too

    internal Segment(long index, ConcurrentQueue<T> source)
    {
        m_array = new T[SEGMENT_SIZE];              // field only assigned here
        m_state = new VolatileBool[SEGMENT_SIZE];   // field only assigned here
        m_high = -1;
        m_index = index;                            // field only assigned here
        m_source = source;                          // field only assigned here
    }

    internal void Grow()
    {
        // m_index and m_source need to be volatile since race hazards
        // may otherwise arise if this method is called before
        // initialization completes (or appears to complete)
        Segment newSegment = new Segment(m_index + 1, m_source);
        m_next = newSegment;
        m_source.m_tail = m_next;
    }

    // ...
}
Community
  • 1
  • 1
Douglas
  • 53,759
  • 13
  • 140
  • 188
  • 1
    How could this question be anything other than opinion based? Unless you're hoping for Eric Lippert to chime in, I don't see how anyone else could answer this with more than either speculation or opinion? – Heretic Monkey Aug 17 '16 at 18:53
  • 1
    It's not unreasonable to think Eric Lippert might chime in. If he still remembers anything about C# ;) – adv12 Aug 17 '16 at 18:54
  • I would find it hard to see such code actually written in real life. Is there an actual proper case when this would be needed, other than purposefully ugly and nondeterministic code? Or if I'm wrong and people actually do this I'll be happy to stand corrected. – Sami Kuhmonen Aug 17 '16 at 18:55
  • @SamiKuhmonen: I added a concrete example. I assume this issue impacts most of `System.Collections.Concurrent`. – Douglas Aug 17 '16 at 18:56
  • @Douglas even with a concrete example, you are still asking for ***speculation*** as to whether or not the implementation is broken, instead of being a specific design decision (and considering the compiler restricts it, it probably is). There could be compiler implications that we aren't aware of and wouldn't be able to properly answer. It may be an interesting question, but that doesn't make it on topic. – David L Aug 17 '16 at 18:58
  • 3
    This question is not speculative or opinion-based. I am asking for someone familiar with the C# compiler and the .NET memory model to answer whether the above is genuinely a language oversight, or whether there is a reason behind the mutual exclusion that I'm missing. Granted, most people don't have that knowledge, but it's not beyond the realm of public knowledge, especially with the recent open-sourcing of .NET. Unless you mean that only trivial questions should be posted on StackOverflow. – Douglas Aug 17 '16 at 19:02
  • 1
    @Douglas If you have a question for Microsoft, then ask Microsoft, don't ask random people on the street. Given that you know nobody here will be able to answer the question, why ask the question here? And yes, the question *is* opinion based. Whether you consider this a "flaw" is a matter of opinion, not a matter of fact. – Servy Aug 17 '16 at 19:04
  • @Servy: I never said that nobody here will be able to answer. If I spent a few months studying the compiler and the memory model, I'd be able to answer it myself, but I'm hoping that someone else has done so before me. – Douglas Aug 17 '16 at 19:05
  • The only reason to pose a question such as this is to incite a debate (and he has succeeded). I've personally never written code like the example anyway. It doesn't seem prudent to use the object you are trying to protect with your critical section as the locking object that initiates your critical section. I always define an arbitrary locking object used for nothing but locking the critical section and use that. – Kevin Aug 17 '16 at 19:10
  • @Kevin Whether you have a second object is irrelevant; having a second object in no way changes the problem. – Servy Aug 17 '16 at 19:12
  • @Servy, if you use a second object, there is no reason that the _queue object would need to be both readonly and volatile. – Kevin Aug 17 '16 at 19:15
  • @Kevin: `_queue` would still need to be volatile. I locked on `_queue` just to keep my code sample small; it could have been a separate field. – Douglas Aug 17 '16 at 19:17
  • @Douglas, adding a locking object would add exactly 1 line of code to the example (still small). And I never said _queue would not need to be volatile, I said it wouldn't need to be both readonly and volatile. – Kevin Aug 17 '16 at 19:19
  • I, of course, may be saying nonsense, but I personally see some twisted logic in this question. Correct me if I'm wrong, but as far as I remember constructors are not implicitly thread-safe. So the question here is about why constructor is not thread-safe and how to provide such safety. It's quite far from discussion about `volatile` or `readonly`. Sorry, I can't explain it better. – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 19:21
  • @Kevin: It should still be readonly too, since it's only ever assigned during the instance's initialization. A field never *needs* to be readonly (not even if you lock on it), but that's beyond the point. – Douglas Aug 17 '16 at 19:22
  • @Sergey.quixoticaxis.Ivanov: I tackled that in my previous question, "[Should thread-safe class have a memory barrier at the end of its constructor?](http://stackoverflow.com/q/38881722/1149773)". A constructor may be made thread-safe with memory barriers. This question discusses one approach for how that may be achieved, using volatile, and why the C# language seemingly forbids it. – Douglas Aug 17 '16 at 19:25
  • @Kevin That's just moving the problem elsewhere. If you create a second object then that second object needs to be both readonly and volatile. It has exactly the same problems that this has. – Servy Aug 17 '16 at 19:31
  • @Douglas my point was that while `private volatile Queue _queue = new Queue();` fixes the problem in real life (thanks to implementation), I think that it does nothing in theory (specification wise). Imaginary `volatile readonly Qeuue _queue = new Queue();` will 1) enforce readonly semantics; 2) make a write to the reference variable thread independent. Imaginary `volatile readonly` will (yet again, as far as I understand) still allow writing the reference before the actual initialization because both `volatile` and `readonly` has no impact on constructor. – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 19:31
  • @Sergey.quixoticaxis.Ivanov: volatile will prevent the internal `Queue` from being accessed by other threads before it has been initialized (or appears as initialized on their memory cache). Yes, volatile prevents `NullReferenceException` on the `_queue` field, but it also prevents a slew of other errors internal to the `Queue` class. – Douglas Aug 17 '16 at 19:44
  • @Douglas `volatile` will prevent reading of address in _queue while it's is not written into it. But I can't find some quote from any specification that guarantees that constructor can't write an address of still uninitialized object to this variable. I can imagine a constructor internal implementation that writes the address to the variable first and then does everything else without reading or writing this address. – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 19:50
  • @Sergey.quixoticaxis.Ivanov: If your readers are also constrained by a memory barrier (as implicitly generated by volatile or lock), then you're guaranteed that the entire `Queue` will appear as initialized before they may access it. – Douglas Aug 17 '16 at 19:55
  • @Douglas I fail to see why you suppose that writing an address to volatile variable and initializing an instance are equal. – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 19:57
  • Imaginary ctor: `void ctor(...) { address = GetSomeMemory(); volatileVariable = address; // now you can read it without violating volatile DoSomeInitializationForAgesAndNotUsingVolatileVariable(); }` – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 19:59
  • 2
    The update with the concrete scenario doesn't quite seem concrete. It doesn't say how it "seems to affect code." Perhaps we could get an example? – Scott Hannen Aug 17 '16 at 20:00
  • @ScottHannen: Added. – Douglas Aug 17 '16 at 20:07
  • @Douglas also I remembered, constructor does not pass anything out, the caller already has the address from operator new. Just some magic text:`MethodName: .ctor (06000002) Flags: [Public] [HideBySig] [ReuseSlot] [SpecialName] [RTSpecialName] [.ctor] (00001886) RVA : 0x0000206b ImplFlags : [IL] [Managed] (00000000) CallCnvntn: [DEFAULT] hasThis ReturnType: Void No arguments.` Ctor gets already existing address as a hidden argument, so it lives in a completely different universe from volatile and readonly imho. – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 20:10
  • @Sergey.quixoticaxis.Ivanov: [Albahari](http://www.albahari.com/threading/part4.aspx#_Nonblocking_Synchronization) explains this better than I could: "The volatile keyword instructs the compiler to generate an *acquire-fence* on every read from that field, and a *release-fence* on every write to that field. An acquire-fence prevents other reads/writes from being moved *before* the fence; a release-fence prevents other reads/writes from being moved *after* the fence." – Douglas Aug 17 '16 at 20:12
  • @Douglas I know what `volatile` does) ThreadN1 acquires memory and address. ThreadN1 writes the address. Now ThreadN2 reads from this address. **_No writes or reads were moved at all_**. ThreadN2 has some address to wicked uninitialized object. Why not? `volatile` guarantees were not violated. Now ThreadN1 calls the .ctor reading and passing it an address. **_No writes or reads were moved at all (still)_**. Ctor works. – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 20:18
  • @Douglas - you posted the source code, but you had already linked to it, so I saw it. But can you demonstrate a defective behavior? – Scott Hannen Aug 17 '16 at 20:24
  • @ScottHannen: The defect is that the fields should be declared as readonly, but cannot be. As I mentioned in the question, the absence of a readonly does not cause any defective behaviour for consumers, but that's beside the point, since the same reasoning applies to readonly in general. – Douglas Aug 17 '16 at 20:29
  • @Douglas same as with Thread.MemoryBarrier() I can't find a quote in ECMA that enforces calling new (that includes write) being atomic with calling of the constructor. Btw if you know such quote, I would be grateful if you link it. – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 20:31
  • @Sergey.quixoticaxis.Ivanov: "Now ThreadN2 reads from this address." That behaviour is prohibited if the address were protected by a volatile. – Douglas Aug 17 '16 at 20:31
  • @Douglas why? It was the last write in ThreadN1. Or actually so when in your opinion the read from volatile variable will be allowed for ThreadN2? – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 20:33
  • @Sergey.quixoticaxis.Ivanov: I think you might be right; volatile doesn't prevent reordering of a write followed by a read. I'll need to look into this again. – Douglas Aug 17 '16 at 20:36
  • @Douglas now I'm also really interested in it. I hope we'll get someone with actual knowledge of how this is implemented to share some thoughts on the matter. – Sergey.quixoticaxis.Ivanov Aug 17 '16 at 20:39
  • Maybe you should ask "What are reasons this is forbidden?" so that the answer can be an objective list instead of subjective (flaw or not - who knows). – usr Aug 17 '16 at 20:57
  • @usr That's no less subjective. – Servy Aug 17 '16 at 21:18
  • @Servy what I posted clearly is a reason. It's not exhaustive but I don't think that is or should be against the site rules. – usr Aug 17 '16 at 21:45
  • @usr The fact that you provided a reason doesn't mean it's not an opinion. I can post reasons why ice cream is delicious, but that doesn't make them not opinions. – Servy Aug 17 '16 at 21:55
  • I do not know why this design decision was made; it was long before my time on the team. – Eric Lippert Sep 08 '16 at 12:10

3 Answers3

4

readonly fields are fully writable from the constructor's body. Indeed, one could use volatile accesses to a readonly field to cause a memory barrier. Your case is, I think, a good case for doing that (and it's prevented by the language).

It is true that writes made inside of the constructor might not be visible to other threads after the ctor has completed. They might become visible in any order even. This is not well known because it rarely comes into play in practice. The end of the constructor is not a memory barrier (as often assumed from intuition).

You can use the following workaround:

class Program
{
    readonly int x;

    public Program()
    {
        Volatile.Write(ref x, 1);
    }
}

I tested that this compiles. I was not sure whether it is allowed to form a ref to a readonly field or not, but it is.

Why does the language prevent readonly volatile? My best guess is that this is to prevent you from making a mistake. Most of the time this would be a mistake. It's like using await inside of lock: Sometimes this is perfectly safe but most of the time it's not.

Maybe this should have been a warning.

Volatile.Write did not exist at the time of C# 1.0 so the case for making this a warning for 1.0 is stronger. Now that there is a workaround the case for this being an error is strong.

I do not know if the CLR disallows readonly volatile. If yes, that could be another reason. The CLR has a style of allowing most operations that are reasonable to implement. C# is far more restrictive than the CLR. So I'm quite sure (without checking) that the CLR allows this.

usr
  • 168,620
  • 35
  • 240
  • 369
  • 1
    That's a good point (and analogy). If the risk of mistakes arising from allowing `readonly volatile` is greater than for those arising from the forced omission of a `readonly`, then the language design feels justified. Although the error should ideally have indicated that some usages are sensible, and suggested the `Volatile.Write` alternative. – Douglas Aug 17 '16 at 21:07
2
ThreadSafeQueue<int> tsqueue = null;

Parallel.Invoke(
    () => tsqueue = new ThreadSafeQueue<int>(),
    () => tsqueue?.Enqueue(5));

In your example the issue is that tsqueue is published in non-thread safe manner. And in this scenario it's absolutely possible to get a partially constructed object on architectures like ARM. So, mark tsqueue as volatile or assign the value with Volatile.Write method.

This issue seems to affect code in the System.Collections.Concurrent namespace of the .NET Framework Class Library itself. The ConcurrentQueue.Segment nested class has several fields that are only ever assigned within the constructor: m_array, m_state, m_index, and m_source. Of these, only m_index is declared as readonly; the others cannot be – although they should – since they need to be declared as volatile to meet the requirements of thread-safety.

Marking fields as readonly just adds some constraints that compiler checks and that JIT might later use for optimizations (but JIT is smart enough to figure out that field is readonly even without that keyword in some cases). But marking these particular fields volatile is much more important because of concurrency. private and internal fields are under control of the author of that library, so it's absolutely ok to omit readonly there.

Valery Petrov
  • 653
  • 7
  • 19
1

First of all, it seems to be a limitation imposed by the language, not the platform:

.field private initonly class SomeTypeDescription modreq ([mscorlib]System.Runtime.CompilerServices.IsVolatile) SomeFieldName    

compiles fine, and I couldn't find any quote stating that initonly (readonly) can not be paired with modreq ([mscorlib]System.Runtime.CompilerServices.IsVolatile) (volatile).

As far as I understand, the described situation probably comes from low level instruction swapping. The code that constructs an object and places it into a field looks like:

newobj       instance void SomeClassDescription::.ctor()   
stfld        SomeFieldDescription

And as ECMA states:

The newobj instruction allocates a new instance of the class associated with ctor and initializes all the fields in the new instance to 0 (of the proper type) or null as appropriate. It then calls the constructor with the given arguments along with the newly created instance. After the constructor has been called, the now initialized object reference is pushed on the stack.

So, as far as I understand, until the instructions are not swapped (which is imho possible because returning of the address of created object and filling this object are stores to different locations), you always see either fully initialized object or null when reading from another thread. That can be guaranteed by using volatile. It would prevent swapping:

newobj
volatile.
stfld

P.s. It's not an answer in itself. I don't know why C# forbids readonly volatile.