43

I'm refactoring my libraries to use Span<T> for avoiding heap allocations if possible but as I target also older frameworks I'm implementing some general fallback solutions as well. But now I found a weird issue and I'm not quite sure whether I found a bug in .NET Core 3 or am I doing something illegal.

The issue:

// This returns 1 as expected but cannot be used in older frameworks:
private static uint ReinterpretNew()
{
    Span<byte> bytes = stackalloc byte[4];
    bytes[0] = 1; // FillBytes(bytes);

    // returning bytes as uint:
    return Unsafe.As<byte, uint>(ref bytes.GetPinnableReference());
}

// This returns garbage in .NET Core 3.0 with release build:
private static unsafe uint ReinterpretOld()
{
    byte* bytes = stackalloc byte[4];
    bytes[0] = 1; // FillBytes(bytes);

    // returning bytes as uint:
    return *(uint*)bytes;
}

Interestingly enough, ReinterpretOld works well in .NET Framework and in .NET Core 2.0 (so I could be happy with it after all), still, it bothers me a bit.

Btw. ReinterpretOld can be fixed also in .NET Core 3.0 by a small modification:

//return *(uint*)bytes;
uint* asUint = (uint*)bytes;
return *asUint;

My Question:

Is this a bug or does ReinterpretOld work in older frameworks only by accident and should I apply the fix also for them?

Remarks:

  • The debug build works also in .NET Core 3.0
  • I tried to apply [MethodImpl(MethodImplOptions.NoInlining)] to ReinterpretOld but it had no effect.
Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
György Kőszeg
  • 17,093
  • 6
  • 37
  • 65
  • 2
    FYI: `return Unsafe.As(ref bytes[0]);` or `return MemoryMarshal.Cast(bytes)[0];` - no need to use `GetPinnableReference()`; looking into the other bit, though – Marc Gravell Nov 26 '19 at 13:18
  • [SharpLab](https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIGYACMhgYQYG8aHuGAHKAJYA3bBhhNySBgFcAdrmwAzcdIGyMDAEow1YqPxgYA8gBsAJgAoAlFx6dqPRw2ABPMQCpnbmLgYBeBlwMbDAAa2wTEwgwLzEAbRQAXQBuWydYnziABkT/BnJUh3SmAHYGdwtVdXcrVzFcQscAXzSGVv5hUXFiSRl5JRVdLR11GH1YY3NSa1b7YozPOp88oJDwyOiMhJTWxyXcbNyAgt2eKoxPPABVIYDK3Rr9xuLiMvdr3WfuFuomoA===) in case it helps anyone else. The two versions which avoid `Span` do compile to different IL. I don't *think* you're doing anything invalid: I suspect a JIT bug. – canton7 Nov 26 '19 at 13:23
  • what is the garbage that you are seeing? are you using the hack to disable locals-init? this hack *significantly* impacts `stackalloc` (i.e. it doesn't wipe the space allocated) – Marc Gravell Nov 26 '19 at 13:23
  • @canton7 if they compile to the same IL, we can't infer it is a JIT bug... if the IL is the same, etc... sounds more like a compiler bug, if anything, perhaps with an older compiler? György : can you indicate exactly how you're compiling this? what SDK, for example? I cannot repro the garbage – Marc Gravell Nov 26 '19 at 13:25
  • The SDK version number is 3.0.100. The garbage: I get different numbers for each run. I suspect that the array is not initialized with zeros in release build (though then the applied fix was illegal, too). – György Kőszeg Nov 26 '19 at 13:27
  • @canton7 oh, my bad; sorry – Marc Gravell Nov 26 '19 at 13:27
  • @GyörgyKőszeg and to confirm - you aren't using the locals-init hack? also: what are the older frameworks that you want to target? span goes down to net45, for example – Marc Gravell Nov 26 '19 at 13:28
  • @MarcGravell: No other hacks than in the provided sample. I also target framework 3.5 and 4.0. – György Kőszeg Nov 26 '19 at 13:30
  • 1
    It looks like stackalloc doesn't always zero, actually: [link](https://github.com/dotnet/coreclr/issues/1279#issuecomment-124003439) – canton7 Nov 26 '19 at 13:30
  • @canton7: Ahh, a very good finding, thank you. – György Kőszeg Nov 26 '19 at 13:35
  • @canton7 that's .... not obvious and unexpected! but indeed: "manually wipe your `stackalloc`" would seem to be the order of the day – Marc Gravell Nov 26 '19 at 13:35
  • And actually the verdict was "C# will keep zero initializing by default. Changing the default would be too breaking.". So maybe I will open a new issue for them. – György Kőszeg Nov 26 '19 at 13:38
  • @canton7 you should add an answer, IMO; I could do it, but: it is your "find" – Marc Gravell Nov 26 '19 at 13:38
  • "I also target framework 3.5 and 4.0" - I understand where you're coming from - until very recently, some of my libs offered builds right down to net20; but frankly: it is probably time to cut the ties there. My recommendation is netstandard2.0, adding net462 (or ideally, net472) and/or netcoreapp3.1 etc if there are APIs in them that you can make use of to optimize pieces – Marc Gravell Nov 26 '19 at 14:23
  • You know what you mean. I have a VS debugger component that still supports Windows XP and VS2008. Are you trying to say it's time to let them go? :) – György Kőszeg Nov 26 '19 at 14:31

1 Answers1

36

Ooh, this is a fun find; what is happening here is that your local is getting optimized away - there are no locals remaining, which means that there is no .locals init, which means that stackalloc behaves differently, and does not wipe the space;

private static unsafe uint Reinterpret1()
{
    byte* bytes = stackalloc byte[4];
    bytes[0] = 1;

    return *(uint*)bytes;
}

private static unsafe uint Reinterpret2()
{
    byte* bytes = stackalloc byte[4];
    bytes[0] = 1;

    uint* asUint = (uint*)bytes;
    return *asUint;
}

becomes:

.method private hidebysig static uint32 Reinterpret1() cil managed
{
    .maxstack 8
    L_0000: ldc.i4.4 
    L_0001: conv.u 
    L_0002: localloc 
    L_0004: dup 
    L_0005: ldc.i4.1 
    L_0006: stind.i1 
    L_0007: ldind.u4 
    L_0008: ret 
}

.method private hidebysig static uint32 Reinterpret2() cil managed
{
    .maxstack 3
    .locals init (
        [0] uint32* numPtr)
    L_0000: ldc.i4.4 
    L_0001: conv.u 
    L_0002: localloc 
    L_0004: dup 
    L_0005: ldc.i4.1 
    L_0006: stind.i1 
    L_0007: stloc.0 
    L_0008: ldloc.0 
    L_0009: ldind.u4 
    L_000a: ret 
}

I think I'd be happy to say that this is a compiler bug, or at least: an undesirable side-effect and behavior given that previous decisions have been put in place to say "emit the .locals init", specifically to try and keep stackalloc sane - but whether the compiler folks agree is up to them.

The workaround is: treat the stackalloc space as undefined (which, to be fair, is what you're meant to do); if you expect it to be zeros: manually zero it.

Marc Gravell
  • 1,026,079
  • 266
  • 2,566
  • 2,900
  • 2
    It seems there is an [open ticket](https://github.com/dotnet/roslyn/issues/23951) for this. I'm going to add a new comment to that. – György Kőszeg Nov 26 '19 at 14:02
  • Huh, all my work and I didn't notice the first was missing `locals init`. Nice one. – canton7 Nov 26 '19 at 14:32
  • 1
    @canton7 if you're anything like me, you automatically skip past `.maxstack` and `.locals`, making it especially easy to not notice that that it is/isn't there :) – Marc Gravell Nov 26 '19 at 14:36
  • 1
    `The content of the newly allocated memory is undefined.` according to MSDN. The specification doesn't say that the memory should be zeroed either. So it looks like it only works on old framework by accident, or as a result of a non-contractual behaviour. – Luaan Nov 27 '19 at 08:14