
It is my understanding that C# is a safe language and doesn't allow one to access unallocated memory, other than through the unsafe keyword. However, its memory model allows reordering when there is unsynchronized access between threads. This leads to race hazards where references to new instances appear to become available to racing threads before the instances have been fully initialized, which is a widely known problem for double-checked locking. Chris Brumme (from the CLR team) explains this in their Memory Model article:

Consider the standard double-locking protocol:

```csharp
if (a == null)
{
    lock (obj)
    {
        if (a == null)
            a = new A();
    }
}
```

This is a common technique for avoiding a lock on the read of ‘a’ in the typical case. It works just fine on X86. But it would be broken by a legal but weak implementation of the ECMA CLI spec. It’s true that, according to the ECMA spec, acquiring a lock has acquire semantics and releasing a lock has release semantics.

However, we have to assume that a series of stores have taken place during construction of ‘a’. Those stores can be arbitrarily reordered, including the possibility of delaying them until after the publishing store which assigns the new object to ‘a’. At that point, there is a small window before the store.release implied by leaving the lock. Inside that window, other CPUs can navigate through the reference ‘a’ and see a partially constructed instance.

I've always been confused by what "partially constructed instance" means. Assuming that the .NET runtime clears out memory on allocation rather than garbage collection (discussion), does this mean that the other thread might read memory that still contains data from garbage-collected objects (like what happens in unsafe languages)?

Consider the following concrete example:

```csharp
byte[] buffer = new byte[2];

Parallel.Invoke(
    () => buffer = new byte[4],
    () => Console.WriteLine(BitConverter.ToString(buffer)));
```

The above has a race condition; the output would be either 00-00 or 00-00-00-00. However, is it possible that the second thread reads the new reference to buffer before the array's memory has been initialized to 0, and outputs some other arbitrary string instead?

Douglas
  • [relevant](http://joeduffyblog.com/2010/06/27/on-partiallyconstructed-objects/) on partially constructed instances; and of course [this one](https://stackoverflow.com/a/8359205/1132334), second quote – Cee McSharpface Jul 04 '18 at 21:24
  • Yes. But do note that those other threads will have to ignore the lock. I wouldn't call this a big practical problem; it will only hurt 'lock-free' code. And when you think you're up to that, you can handle this little issue. And it's all about a hypothetical platform. – H H Jul 04 '18 at 21:31
  • Thanks @dlatikay; those are very relevant links. Brian Gideon's answer implies that garbage data *could* be printed for the example in that question, which is sufficiently similar to mine. – Douglas Jul 04 '18 at 21:33
  • @HenkHolterman: Thanks for the confirmation. Yes, I realize this will only happen on code that lacks the appropriate synchronization or memory barriers. On a practical level, I'm more concerned about the security implications of it. If code were to be run on an architecture with a weak memory model, bugs such as the above could cause sensitive data to be leaked through access to unallocated memory, which is something I thought wasn't possible in .NET (excluding `unsafe`). – Douglas Jul 04 '18 at 21:37
  • "unallocated" but still from the same process. This won't let you spy on your neighbours. – H H Jul 04 '18 at 21:39
  • And I retract my "Yes". .NET clears out memory on allocation, so I think "partially constructed" will only show you 0/null (default) values, not the previous occupant of the bytes. Which makes the null-reference exception about the worst case scenario here. – H H Jul 04 '18 at 21:41
  • That would make it "safe", but I don't understand how it's achieved. Does the ECMA spec (which I admittedly haven't read) mandate a release fence after object allocation? If not, there's still no guarantee that the stores corresponding to the "clearing out" would not be reordered with respect to the storing of the new reference. – Douglas Jul 04 '18 at 21:48
  • Well, the clearing can't be reordered to after the first of those 'other stores to a'. Something must prevent that at least. – H H Jul 04 '18 at 22:01
  • [this article](https://msdn.microsoft.com/en-us/magazine/jj883956.aspx) suggests that ECMA does not mandate a release fence. [ecma CLI F.4.1](http://standards.iso.org/ittf/PubliclyAvailableStandards/c042927_ISO_IEC_23271_2006(E).zip) is vague. The CLR did use release fences (an ST.REL instead of a plain ST) back when Itanium was a .NET target platform, which was the only supported hardware with a weak memory model until now. – Cee McSharpface Jul 04 '18 at 22:09
  • Thanks @dlatikay. These discussions used to be largely academic since very few of us were actually on IA-64. However, I understand that .NET Core can now run on ARM architectures, which also have a weak memory model, no? – Douglas Jul 05 '18 at 17:52
  • Yes, he's saying that memory content might not look initialized from another thread on such a processor. The IA64 gave Microsofties a very hard time; I remember reading that they punted by making every memory read an acquire and every write a release. Well, that's why the jitter is discontinued and the processor beyond life support. They did make changes in the memory model for the ARM jitter, not otherwise documented beyond "we worked on it". It was their job to just make it work, and they seem to have done a good job, since I haven't yet seen a question about it on SO. – Hans Passant Jul 08 '18 at 11:48
  • I should note that ARM is the only processor design that implemented the C++11 memory model in hardware. They are pretty doggone good at keeping programmers happy, a winning strategy that IA64 missed so badly. – Hans Passant Jul 08 '18 at 11:51
  • Thanks @HansPassant for that helpful insight. It's good to know that these issues have been addressed for the ARM jitter. – Douglas Jul 08 '18 at 12:21
  • The remark doesn't even have anything to do with locking, i.e. any code doing `field = new Something()` where `field` is not volatile could fail on a weak implementation, unless implemented as `var tmp = new Something(); Thread.MemoryBarrier(); field = tmp;` – vgru Jul 12 '18 at 09:20
  • @Groo: Agreed. That article was presenting one case where the absence of a memory barrier would cause issues. There are several more. – Douglas Jul 12 '18 at 17:22

1 Answer


Let's not bury the lede here: the answer to your question is no, you will never observe the pre-allocated state of memory in the CLR 2.0 memory model.

I'll now address a couple of your non-central points.

It is my understanding that C# is a safe language and doesn't allow one to access unallocated memory, other than through the unsafe keyword.

That is more or less correct. There are some mechanisms by which one can access bogus memory without using unsafe -- via unmanaged code, obviously, or by abusing structure layout. But in general, yes, C# is memory safe.

However, its memory model allows reordering when there is unsynchronized access between threads.

Again, that's more or less correct. A better way to think about it is that C# allows reordering at any point where the reordering would be invisible to a single threaded program, subject to certain constraints. Those constraints include introducing acquire and release semantics in certain cases, and preserving certain side effects at certain critical points.

Chris Brumme (from the CLR team) ...

The late great Chris's articles are gems and give a great deal of insight into the early days of the CLR, but I note that there have been some strengthenings of the memory model since 2003 when that article was written, particularly with respect to the issue you raise.

Chris is right that double-checked locking is super dangerous. There is a correct way to do double-checked locking in C#, and the moment you depart from it even slightly, you are off in the weeds of horrible bugs that only repro on weak memory model hardware.
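For reference, the commonly cited correct form in C# marks the field `volatile`, so the publishing store cannot be observed before the constructor's writes complete (a sketch only; the `Singleton` type is illustrative, and in modern code `Lazy<T>` is usually the simpler choice):

```csharp
public sealed class Singleton
{
    // volatile gives the publishing store release semantics and the
    // outer read acquire semantics, closing the window Chris describes.
    private static volatile Singleton instance;
    private static readonly object sync = new object();

    private Singleton() { }

    public static Singleton Instance
    {
        get
        {
            if (instance == null)          // cheap unsynchronized read
            {
                lock (sync)
                {
                    if (instance == null)  // re-check under the lock
                        instance = new Singleton();
                }
            }
            return instance;
        }
    }
}
```

Drop the `volatile` and the pattern is exactly the broken one from the quote above.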

does this mean that the other thread might read memory that still contains data from garbage-collected objects

I think your question is not specifically about the old weak ECMA memory model that Chris was describing, but rather about what guarantees are actually made today.

It is not possible for re-orderings to expose the previous state of objects. You are guaranteed that when you read a freshly-allocated object, its fields are all zeros.

This is made possible by the fact that all writes have release semantics in the current memory model; see this for details:

http://joeduffyblog.com/2007/11/10/clr-20-memory-model/

The write that initializes the memory to zero will not be moved forwards in time with respect to a read later.
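To make that concrete with your `buffer` example, here is a sketch with the publication and the read made explicit via `System.Threading.Volatile` (the explicit calls are illustrative; the current model already gives ordinary reference stores release semantics):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class PublicationSketch
{
    static byte[] buffer = new byte[2];

    public static string ReadOnce()
    {
        // Publish a freshly allocated array. Allocation zero-fills it, and
        // the publishing store has release semantics, so a racing reader may
        // observe either array, but never uninitialized memory contents.
        Task writer = Task.Run(() => Volatile.Write(ref buffer, new byte[4]));
        byte[] local = Volatile.Read(ref buffer); // acquire on the reading side
        writer.Wait();
        return BitConverter.ToString(local);      // "00-00" or "00-00-00-00"
    }
}
```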

I've always been confused by "partially constructed objects"

Joe discusses that here: http://joeduffyblog.com/2010/06/27/on-partiallyconstructed-objects/

Here the concern is not that we might see the pre-allocation state of an object. Rather, the concern here is that one thread might see an object while the constructor is still running on another thread.

Indeed, it is possible for the constructor and the finalizer to be running concurrently, which is super weird! Finalizers are hard to write correctly for this reason.
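As a sketch of the defensive style this forces on finalizers (`AcquireHandle` and `ReleaseHandle` are hypothetical stand-ins for native calls):

```csharp
using System;

class NativeWrapper
{
    private IntPtr handle; // fields read as their zeroed defaults until assigned

    public NativeWrapper()
    {
        handle = AcquireHandle(); // hypothetical native acquisition
        // If an exception escaped before this point, the finalizer could
        // still run against a half-initialized object.
    }

    ~NativeWrapper()
    {
        // Defensive: the constructor may not have completed, so 'handle'
        // may still hold its zeroed default value.
        if (handle != IntPtr.Zero)
            ReleaseHandle(handle);
    }

    static IntPtr AcquireHandle() => new IntPtr(1); // stand-in for P/Invoke
    static void ReleaseHandle(IntPtr h) { /* stand-in for P/Invoke */ }
}
```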

Put another way: the CLR guarantees you that its own invariants will be preserved. An invariant of the CLR is that newly allocated memory is observed to be zeroed out, so that invariant will be preserved.

But the CLR is not in the business of preserving your invariants! If you have a constructor which guarantees that field x is true if and only if y is non-null, then you are responsible for ensuring that this invariant is always observed to be true. If in some way this is observed by two threads, then one of those threads might observe the invariant being violated.
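A sketch of the kind of user-level invariant meant here (the `Cell` type and its fields are hypothetical):

```csharp
using System;

// Intended invariant: HasValue == true implies Value != null.
// The constructor establishes it, but nothing forces other threads to
// observe it: on a sufficiently weak (yet conforming) platform, an
// unsynchronized reader of a shared Cell reference could see
// HasValue == true while Value still reads as null.
class Cell
{
    public object Value;
    public bool HasValue;

    public Cell()
    {
        Value = new object();
        HasValue = true;
    }
}
```

On the constructing thread the invariant always holds; it is only cross-thread, unsynchronized observation that can break it, which is why publication needs a `volatile` field, a lock, or an explicit `Volatile.Write`.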

Eric Lippert
  • Thanks Eric for the detailed reply, which addresses all my questions from the POV of the CLR memory model. However, I am interested to know the state of affairs for other runtime implementations, such as CoreCLR running on ARM. Has there been a revised spec that mandates such implementations enforce a stronger memory model, meeting your guarantees, or could a conforming implementation still cause the issues I describe? The closest answer I've gotten is [Hans's comment](https://stackoverflow.com/q/51180784/1149773#comment89442445_51180784) that the model has been changed but not documented. – Douglas Jul 25 '18 at 19:25
  • @Douglas: The history of the memory model and its documentation is confusing and arcane, and I am unfortunately no expert on it. I do not know of any revised spec. That said, I would be *incredibly surprised* if any conforming implementation of the CLR allowed you to *ever* read the previous contents of memory from code in the safe subset. That seems insanely dangerous from both a correctness and a security perspective, and I would expect that this would be prevented. – Eric Lippert Jul 25 '18 at 22:43
  • @Douglas: For example, consider ASP.NET. Yes, it supports process isolation if you want it, but a by-design scenario is that some hosting company is hosting both coke.com and pepsi.com, and the back-end code for both could be running in the same process. If there was a way for one web site running in the safe subset to see the not-yet-cleaned-up garbage of another, that would be front page news; the CLR is not supposed to be the new Heartbleed vector. :-) – Eric Lippert Jul 25 '18 at 22:45