
I know that I can create an immutable (i.e. thread-safe) object like this:

class CantChangeThis
{
    private readonly int value;

    public CantChangeThis(int value)
    {
        this.value = value;
    }

    public int Value { get { return this.value; } } 
}

However, I typically "cheat" and do this:

class CantChangeThis
{
    public CantChangeThis(int value)
    {
        this.Value = value;
    }

    public int Value { get; private set; } 
}

Then I got to wondering, "Why does this work?" Is it really thread-safe? If I use it like this:

var instance = new CantChangeThis(5);
ThreadPool.QueueUserWorkItem(_ => doStuff(instance));

Then what it's really doing is (I think):

  1. Allocating space on the thread-shared heap for the instance
  2. Initializing the value inside the instance on the heap
  3. Writing a pointer/reference to that space into the local variable (thread-specific stack)
  4. Passing the reference to that thread as a value. (Interestingly the way I've written it, the reference is inside a closure, which is doing the same thing that my instance is doing, but let's ignore that.)
  5. Thread goes to the heap and reads data from the instance.

However, that instance value is stored in shared memory. The two threads might have cache-inconsistent views of that memory on the heap. What is it that makes sure the threadpool thread actually sees the constructed instance and not some garbage data? Is there an implicit memory barrier at the end of any object construction?
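For concreteness, the scenario in the steps above can be run end-to-end as a small program (the `ManualResetEventSlim` and the `Console` output are added here only so the process waits for, and shows, the pool thread's read; they are not part of the original question's code):

```csharp
using System;
using System.Threading;

class CantChangeThis
{
    public CantChangeThis(int value) { Value = value; }
    public int Value { get; private set; }
}

class Program
{
    static void Main()
    {
        // Step 1-3: allocate on the heap, run the constructor,
        // store the reference in a local.
        var instance = new CantChangeThis(5);

        using (var done = new ManualResetEventSlim())
        {
            // Step 4: the lambda captures 'instance' in a closure,
            // which is handed to a pool thread.
            ThreadPool.QueueUserWorkItem(_ =>
            {
                // Step 5: the pool thread follows the reference into
                // the heap and reads the instance's data.
                Console.WriteLine(instance.Value); // 5
                done.Set();
            });
            done.Wait();
        }
    }
}
```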

Scott Whitlock
  • I think [this discussion](http://bytes.com/topic/c-sharp/answers/277909-does-threadpool-utilize-memory-barrier) involving Jon Skeet does touch on the same issue I'm thinking about. – Scott Whitlock May 29 '15 at 13:58
  • What is a thread-shared heap? Ain't all heaps thread-shared? – Thomas Weller May 29 '15 at 14:12
  • @Thomas - I suppose `BinaryHeap` doesn't have to be shared if you don't want it to be. :) I was just emphasizing the fact that the heap is shared by all threads, not trying to single out one-of-many heaps. – Scott Whitlock May 29 '15 at 14:14
  • Ok, for me, `BinaryHeap` is a class, not a real heap, even if the name suggests that. I could also create a class called `Register` but it will not be a CPU register then... – Thomas Weller May 29 '15 at 14:21
  • @Thomas - yes, I was kidding – Scott Whitlock May 29 '15 at 15:02

2 Answers

  • Writing a pointer/reference to that space into the local variable (thread-specific stack)
  • Initializing the value inside the instance on the heap

No... invert them. It is closer to:

  • memory for the object is allocated
  • the constructor(s) are called (base-class constructors included)
  • the reference to the memory/object is "returned" from the `new` operator/keyword
  • the reference is "saved" in the variable `instance` (the `=` assignment)

You can check this by throwing an exception in the constructor. The reference variable won't be assigned.
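That check is easy to see in a small program: if the constructor throws, the assignment to the variable never happens (the exception type and class name here are arbitrary):

```csharp
using System;

class ThrowsInCtor
{
    public ThrowsInCtor()
    {
        throw new InvalidOperationException("constructor failed");
    }
}

class Program
{
    static void Main()
    {
        ThrowsInCtor instance = null;
        try
        {
            instance = new ThrowsInCtor();
        }
        catch (InvalidOperationException)
        {
            // The exception escapes 'new' before the assignment runs:
            // the reference is only stored after the constructor
            // returns normally.
        }
        Console.WriteLine(instance == null); // True
    }
}
```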

In general, you don't want another thread to be able to see a semi-initialized object (note that in the first version of Java this wasn't guaranteed... Java 1.0 had what is called a "weak" memory model). How is this achieved?

On Intel it is guaranteed:

The x86-x64 processor will not reorder two writes, nor will it reorder two reads.

This is quite important :-) and it guarantees that the problem won't happen. This guarantee isn't part of .NET or of the ECMA C# specification, but on x86/x64 it is provided by the processor itself, and on Itanium (an architecture without that guarantee) it was provided by the JIT compiler (see the same link). It seems that on ARM this isn't guaranteed (still the same link), but I haven't seen anyone discuss it.

In general, in the example given, this isn't important, because:

Nearly all the operations that relate to threads use a full memory barrier (see Memory barrier generators). A full memory barrier guarantees that all write and read operations before the barrier are really executed before it, and that all read/write operations after the barrier are executed after it. `ThreadPool.QueueUserWorkItem` surely uses a full memory barrier at some point, and the started thread must clearly start "fresh", so it can't have stale data (and per https://stackoverflow.com/a/10673256/613130, I'd say it is safe to rely on the implicit barrier).

Note that Intel processors are naturally cache-coherent... you have to disable cache coherency manually if you don't want it (see for example this question: https://software.intel.com/en-us/forums/topic/278286). So the only possible problems would be a variable that is "cached" in a register, a read that is hoisted earlier, or a write that is delayed (and both of these "problems" are "fixed" by the use of a full memory barrier).
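If you didn't want to rely on those implicit barriers, you could publish the reference through an explicit release/acquire pair. A minimal sketch, assuming the OP's `CantChangeThis` type (the `Publisher` class and its member names are made up for illustration):

```csharp
using System.Threading;

class CantChangeThis
{
    public CantChangeThis(int value) { Value = value; }
    public int Value { get; private set; }
}

class Publisher
{
    private CantChangeThis _shared;

    public void Publish()
    {
        var obj = new CantChangeThis(5);
        // Release semantics: the constructor's writes cannot be
        // reordered after this store of the reference.
        Volatile.Write(ref _shared, obj);
    }

    public int? TryRead()
    {
        // Acquire semantics: if this thread observes the reference,
        // it also observes the fully initialized object behind it.
        var obj = Volatile.Read(ref _shared);
        return obj?.Value;
    }
}
```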

Addendum

Your two pieces of code are equivalent. An auto-property is simply a "hidden" field plus boilerplate get/set accessors that are respectively `return hiddenField;` and `hiddenField = value;`. So if there were a problem with v2 of the code, there would be the same problem with v1 of the code :-)
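Roughly, the compiler lowers the auto-property version into the explicit-field shape. A sketch of what gets generated (the real backing field is named `<Value>k__BackingField` in the emitted IL, which isn't a legal C# identifier, so a stand-in name is used here so the sketch compiles):

```csharp
// What the compiler generates (roughly) for:
//     public int Value { get; private set; }
class CantChangeThis
{
    // Hidden compiler-generated backing field ("<Value>k__BackingField" in IL).
    private int _valueBackingField;

    public CantChangeThis(int value)
    {
        this.Value = value; // calls the private setter below
    }

    public int Value
    {
        get { return _valueBackingField; }
        private set { _valueBackingField = value; }
    }
}
```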

xanatos
  • Thank you, this makes a lot of sense. I guess in general that means I could actually instantiate an object, mutate it (same thread) and then pass it safely to a thread pool thread as long as the original thread never mutates it again. This is a more lenient interpretation than I had been thinking of in my head. – Scott Whitlock May 29 '15 at 14:17
  • @ScottWhitlock Yep. Until you queue the work item or start the thread (because you created it manually with `Thread`), you can do whatever you want with the object. Note that even `lock` and many synchronization classes (probably nearly all) cause an implicit memory barrier... And I do really hope that concurrent collections do the same (otherwise they would be totally useless :-) ) – xanatos May 29 '15 at 14:19

Provided nothing circumvents the language-level blocks to invoke the setter (which can be done with reflection), then your object will remain immutable and thread-safe, just as it would if you used a read-only field.

Regarding shared memory and cache-inconsistent views, those are details that get handled by the framework, the operating system, and your hardware, so you don't need to worry about them when programming something high-level like this.

StriplingWarrior
  • Given that the ECMA spec doesn't seem to guarantee what's going on, I do think I need to worry about how it actually works, at least once. I can be a better programmer if I understand this stuff. – Scott Whitlock May 29 '15 at 14:08
  • @ScottWhitlock: Don't get me wrong, it's certainly awesome that you're asking questions and wanting to understand what's going on behind the scenes. Just remember that good programming involves leveraging the abstractions that are available to you, to quickly produce maintainable code. If you get nerd-sniped, worrying about things like the performance details of variable declarations, or which things are going on the stack vs the heap vs CPU registers, that won't help you be a better programmer 99% of the time. – StriplingWarrior May 29 '15 at 14:14