I have big arrays of KeyValuePair<DateTime,decimal>. I know that in memory the array is contiguous, since KVP is a value type, DateTime is effectively an Int64, and decimal is laid out as 4 ints (and that won't change). However, DateTime is not blittable, and decimal is not primitive.

Is there any way to abuse the type system, get an unsafe pointer to the array, and work with it as bytes? (GCHandle.Alloc cannot work with these two types when they are part of a structure, but works OK with arrays of those types.)

(If you are interested in why: I currently convert the array manually to what I believe is a 1-to-1 byte[] representation, and it is slow.)

V.B.

2 Answers

Finally, there is a public tool for this: the System.Runtime.CompilerServices.Unsafe package.

Below is a passing test:

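using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices; // Unsafe is a class in this namespace, not a namespace itself
using NUnit.Framework;
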
[Test]
public unsafe void CouldUseNewUnsafePackage() {
    var dt = new KeyValuePair<DateTime, decimal>[2];
    dt[0] = new KeyValuePair<DateTime, decimal>(DateTime.UtcNow.Date, 123.456M);
    dt[1] = new KeyValuePair<DateTime, decimal>(DateTime.UtcNow.Date.AddDays(1), 789.101M);
    var obj = (object)dt;
    byte[] asBytes = Unsafe.As<byte[]>(obj);
    //Console.WriteLine(asBytes.Length); // prints 2
    fixed (byte* ptr = &asBytes[0]) {
        // reading this: https://github.com/dotnet/coreclr/issues/5870
        // it looks like we could fix byte[] and actually KeyValuePair<DateTime, decimal> will be fixed
        // because:
        // "GC does not care about the exact types, e.g. if type of local object 
        // reference variable is not compatible with what is actually stored in it, 
        // the GC will still track it fine."
        for (int i = 0; i < (8 + 16) * 2; i++) {
            Console.WriteLine(*(ptr + i));
        }
        var firstDate = *(DateTime*)ptr;
        Assert.AreEqual(DateTime.UtcNow.Date, firstDate);
        Console.WriteLine(firstDate);
        var firstDecimal = *(decimal*)(ptr + 8);
        Assert.AreEqual(123.456M, firstDecimal);
        Console.WriteLine(firstDecimal);
        var secondDate = *(DateTime*)(ptr + 8 + 16);
        Assert.AreEqual(DateTime.UtcNow.Date.AddDays(1), secondDate);
        Console.WriteLine(secondDate);
        var secondDecimal = *(decimal*)(ptr + 8 + 16 + 8);
        Assert.AreEqual(789.101M, secondDecimal);
        Console.WriteLine(secondDecimal);
    }
}
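
For comparison, the same package also offers a ref-based route that never reinterprets the array object itself. What follows is only a sketch under assumptions, not part of the test above: the `KvpBytes` class and `ToBytes` method are hypothetical names, and it assumes the installed package version provides `Unsafe.SizeOf<T>()`, the two-type-argument `Unsafe.As<TFrom, TTo>(ref TFrom)` and `Unsafe.CopyBlockUnaligned` (ref locals need C# 7 or later). It copies the raw element bytes in one block, so it replaces a slow per-field conversion rather than avoiding the copy altogether.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical helper, for illustration only.
static class KvpBytes
{
    // Copies the in-memory bytes of all elements into a new byte[] in one block.
    public static byte[] ToBytes(KeyValuePair<DateTime, decimal>[] source)
    {
        // Ask the runtime for the element size instead of hard-coding 8 + 16.
        int elementSize = Unsafe.SizeOf<KeyValuePair<DateTime, decimal>>();
        var result = new byte[source.Length * elementSize];
        if (source.Length == 0) return result;

        // Reinterpret a managed reference to the first element as a reference to its first byte.
        // The GC keeps tracking this reference, so no pinning is needed here.
        ref byte first = ref Unsafe.As<KeyValuePair<DateTime, decimal>, byte>(ref source[0]);
        Unsafe.CopyBlockUnaligned(ref result[0], ref first, (uint)result.Length);
        return result;
    }
}
```

The resulting bytes follow whatever layout the runtime uses in memory, so they are only meaningful to a reader that assumes the same layout.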
V.B.
  • The exact constructs that are safe are still a bit unclear. `of course, you have to know what you are doing` and he does not advise on this. I'd be comfortable using your particular code in production now but only because I don't see how it could go wrong. This is a weak form of evidence that it's safe... – usr Aug 16 '16 at 17:58
  • The comment that I copied tells when it is "safe", I asked [here](https://github.com/dotnet/coreclr/issues/5870) for confirmation from .NET team. – V.B. Aug 16 '16 at 18:14
  • The comment says that a particular concern is not relevant. It does not say that the code is completely safe. In fact it introduces a specific concern (memory layout) and suggest that the code might not be future compatible. Also, going beyond this particular piece of code we still don't know the rules under which unsafe operations are safe. – usr Aug 16 '16 at 18:57
  • @usr still it is better than `__makeref`-like stuff. I already use conditionals `#if NET451` sometimes for the case when I am sure about how a runtime behaves, so this could be used inside such conditionals with huge perf benefits. – V.B. Aug 16 '16 at 19:51
  • @V.B. Is there a way to make this work for multidimensional arrays as well? Like when the type is a 3D array for example `new KeyValuePair[,,]` (not a jagged array!) – Riki May 02 '19 at 07:43
  • @riki 3D arrays have a different object header so you have to unsafely cast to `byte[,]` at least. But maybe there is something else. – V.B. May 03 '19 at 11:05

I just tested that unsafe and GCHandle.Alloc don't work (as you suggested). There is a terribly unsafe hack to still do this. I don't know if this is safe with the current CLR. It certainly is not guaranteed to work in the future.

You can convert an object reference of any type to any other reference type in IL. That IL will not be verifiable. The JIT tends to accept quite a few non-verifiable constructs. Maybe this is because they wanted to support Managed C++.

So you need to generate a DynamicMethod that roughly has the following IL:

static T UnsafeCast(object value) {
 ldarg.0 // load the object argument (a static method's first argument has index 0)
 ret     // return it, typed as T
}

I think this should work...
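
As a rough sketch of how such a DynamicMethod could be built with Reflection.Emit: the `UnsafeCaster<T>` class and its `Cast` field are hypothetical names chosen for illustration. The emitted IL is just "load argument 0, return" and is not verifiable, so this assumes the code runs with full trust on a runtime that accepts it.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical helper: reinterprets any object reference as T without a type check.
static class UnsafeCaster<T> where T : class
{
    // Built once per T and cached as a delegate.
    public static readonly Func<object, T> Cast = Build();

    static Func<object, T> Build()
    {
        var method = new DynamicMethod(
            "UnsafeCast",
            typeof(T),                        // declared return type
            new[] { typeof(object) },         // single parameter
            typeof(UnsafeCaster<T>).Module,   // module the method is associated with
            true);                            // skipVisibility

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // load the object reference
        il.Emit(OpCodes.Ret);     // return it as T, with no castclass in between
        return (Func<object, T>)method.CreateDelegate(typeof(Func<object, T>));
    }
}
```

A call would then look like `byte[] asBytes = UnsafeCaster<byte[]>.Cast(array);`. The returned reference points at the same object, so everything said here about the danger still applies.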

Or, you can call System.Runtime.CompilerServices.JitHelpers.UnsafeCast<T> using Reflection.

This is a dangerous tool... I would not use it in production code.

usr
  • IL to the rescue! Thanks! Why it is dangerous for the specific case when memory layout is known and won't change? – V.B. Sep 30 '15 at 11:08
  • This might mess up the GC and JIT optimizations. You are storing a T2 in a variable being typed as T1. Who knows what kinds of consequences this can have?! Note, that this is possible using supported means for byte[] and sbyte[]. So there is at least one case where the CLR already supports this. – usr Sep 30 '15 at 11:17
  • where in CLR byte[] and sbyte[] are casted? – V.B. Apr 07 '16 at 18:05
  • `(sbyte[])(object)new byte[1]` compiles and runs. When you call `GetType()` you get `byte[]` from an `sbyte[]` variable. That breaks the C# spec but the CLR allows it. – usr Apr 07 '16 at 18:25
  • Thanks! I cannot access System.Runtime.CompilerServices.JitHelpers.UnsafeCast via reflection, but IL works. If I will keep returned `T` together with `input object` and destroy them in reverse - first T instance, then the original one, - do you think there could still be risks to break the entire CLR? I only need to avoid copying and then access the T array by index, as if I used unsafe and pointer casts. – V.B. Apr 07 '16 at 18:38
  • Of course, this is very unsafe. Maybe, though, you can use reflection emit to stick that object reference into a `fixed` local. That should fix its address and you can then use the managed pointer differently. This would be more safe although still not guaranteed to work. It's more safe because the GC never sees "bad" references. – usr Apr 07 '16 at 18:46
  • Unsafe cast is now public, please see my answer. – V.B. Aug 16 '16 at 16:26