
Question

I tried building a struct (SA) using [StructLayout(LayoutKind.Explicit)], which had a field which is another struct (SB).

First: I was surprised that I was allowed to declare that other struct without [StructLayout(LayoutKind.Explicit)], whereas in SA every field must carry a [FieldOffset(...)] attribute, or the compiler will shout. It doesn't make much sense.

  • Is this a loophole in the compiler's warnings/errors?

Second: it seems that all reference (object) fields in SB are moved to the front of SB.

  • Is this behaviour described anywhere?
  • Is it implementation-dependent?
  • Is it defined anywhere that it is implementation-dependent? :)

Note: I'm not intending to use this in production code. I ask this question mainly out of curiosity.

Experimentation

// No object fields in SB
// Gives the following layout (deduced from experimentation with the C# debugger):

// | f0 | f4 and i | f8 and j | f12 and k | f16 |

[StructLayout(LayoutKind.Explicit)]
struct SA {
    [FieldOffset(0)] int f0;
    [FieldOffset(4)] SB sb;
    [FieldOffset(4)] int f4;
    [FieldOffset(8)] int f8;
    [FieldOffset(12)] int f12;
    [FieldOffset(16)] int f16;
}
struct SB { int i; int j; int k; }

// One object field in SB
// Gives the following layout:

// | f0 | f4 and o1 | f8 and i | f12 and j | f16 and k |

// If I add an `object` field after `j` in `SB`, I *have* to convert
// `f4` to `object`, otherwise I get a `TypeLoadException`.
// No other field will do.

[StructLayout(LayoutKind.Explicit)]
struct SA {
    [FieldOffset(0)] int f0;
    [FieldOffset(4)] SB sb;
    [FieldOffset(4)] object f4;
    [FieldOffset(8)] int f8;
    [FieldOffset(12)] int f12;
    [FieldOffset(16)] int f16;
}
struct SB { int i; int j; object o1; int k; }

// Two `object` fields in `SB`
// Gives the following layout:

// | f0 | f4 and o1 | f8 and o2 | f12 and i | f16 and j | k |

// If I add another `object` field after the first one in `SB`, I *have* to convert
// `f8` to `object`, otherwise I get a `TypeLoadException`.
// No other field will do.

[StructLayout(LayoutKind.Explicit)]
struct SA {
    [FieldOffset(0)] int f0;
    [FieldOffset(4)] SB sb;
    [FieldOffset(4)] object f4;
    [FieldOffset(8)] object f8;
    [FieldOffset(12)] int f12;
    [FieldOffset(16)] int f16;
}
struct SB { int i; int j; object o1; object o2; int k; }
Suzanne Soy

2 Answers


Is this a loophole in the compiler's warnings/errors?

No, nothing is wrong with it. Fields are allowed to overlap; that is why LayoutKind.Explicit exists in the first place. It lets you declare the equivalent of a union in unmanaged code, which C# does not otherwise support. You cannot stop using [FieldOffset] partway through a structure declaration: the compiler insists that you apply it to every instance field of the struct. That is not technically necessary, but it is a simple requirement that avoids wrong assumptions.
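As a minimal sketch of the union idea (the names FloatBits and UnionDemo are made up for illustration and do not come from the question), two fields placed at the same offset share the same four bytes, so writing one and reading the other reinterprets the bits:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical union: both fields start at offset 0 and occupy the same 4 bytes.
[StructLayout(LayoutKind.Explicit)]
public struct FloatBits
{
    [FieldOffset(0)] public float Value;
    [FieldOffset(0)] public uint Bits;
}

public static class UnionDemo
{
    // Returns the raw IEEE 754 bit pattern of a float by writing one field
    // and reading the overlapping one.
    public static uint GetBits(float value)
    {
        // new FloatBits() zero-initializes every field, which satisfies the
        // compiler's definite-assignment check before Bits is read.
        FloatBits u = new FloatBits();
        u.Value = value;
        return u.Bits;
    }
}
```

Calling UnionDemo.GetBits(1.0f) should give 0x3F800000. Only value-type fields overlap here, so the runtime has no type-safety objection to loading the struct.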

it seems that all reference (object) fields in SB are moved

Yes, this is normal. The CLR lays out objects in an undocumented and undiscoverable way, and the exact rules it uses are subject to change. The layout may also differ between jitters. It does not become predictable until the object is marshaled, either explicitly with a Marshal.StructureToPtr() call or implicitly by the pinvoke marshaller, which is the only time the exact layout matters. I wrote about the rationale for this behavior in this answer.
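To see the marshaled view specifically, here is a rough sketch that asks the Marshal class for offsets. The struct Blittable and the helper class are hypothetical; the struct mirrors the question's SB with only int fields. These numbers describe the unmanaged layout the marshaller will produce, and say nothing about how the CLR arranges the fields in managed memory:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for the question's SB, restricted to int fields.
[StructLayout(LayoutKind.Sequential)]
public struct Blittable
{
    public int i;
    public int j;
    public int k;
}

public static class MarshalLayoutDemo
{
    // Byte offset of a field in the marshaled (unmanaged) representation.
    public static int OffsetOf(string fieldName)
    {
        return Marshal.OffsetOf(typeof(Blittable), fieldName).ToInt32();
    }

    // Total size in bytes of the marshaled representation.
    public static int Size()
    {
        return Marshal.SizeOf(typeof(Blittable));
    }
}
```

For this struct, OffsetOf("i"), OffsetOf("j") and OffsetOf("k") should come out as 0, 4 and 8, and Size() as 12.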

Hans Passant
  • If I'm not mistaken, the struct also has the "marshalled-like layout" when we cast a raw unsafe pointer to a pointer to that struct, doesn't it? Say, in `unsafe { SA* sa = (SA*)0x1234; /* Use sa here */ }`, `sa` will have the same layout as when marshalling it, won't it? – Suzanne Soy Mar 13 '13 at 18:50
  • No, that's not marshaling. The compiler won't let you do that, the struct is not blittable. Try it. – Hans Passant Mar 13 '13 at 19:00
  • Indeed, it works with structs only containing ints, but not when I add an object field. However, the question still holds for structs that only contain `int`s and such, like the first one I used. – Suzanne Soy Mar 13 '13 at 19:18

The answer to the first question is no, there's no loophole or bug in the compiler's error reporting. If you start doing explicit layout, the compiler is going to assume that you know what you're doing (within limits--see below). You told it to overlay one structure on top of another. The compiler doesn't (and shouldn't) care that the structure you're overlaying isn't also explicitly laid out.

If the compiler did care, then you wouldn't be able to overlay any type that wasn't explicitly laid out, meaning that you couldn't do a union in the general case. Consider, for example, trying to overlay a DateTime and a long:

[StructLayout(LayoutKind.Explicit)]
struct MyUnion
{
    [FieldOffset(0)]
    public bool IsDate;
    [FieldOffset(1)]
    public DateTime dt;
    [FieldOffset(1)]
    public long counter;
}

Under such a rule, this wouldn't compile unless DateTime were itself explicitly laid out, which is probably not what you want.

As far as putting reference types in explicitly laid out structures goes, your results are going to be ... probably not what you expected. Consider, for example, this simple bit:

[StructLayout(LayoutKind.Explicit)]
struct MyUnion
{
    [FieldOffset(0)]
    public object o1;
    [FieldOffset(0)]
    public SomeRefType o2;
}

That violates type safety in a big way. If it compiles (which it very well might), it will die with a TypeLoadException when you try to use it.

The compiler will prevent you from violating type safety where possible. I don't know whether the compiler knows how to process those attributes and lay out the structure, or whether it just passes the layout information to the runtime through the generated MSIL. Probably the latter, considering your second example, where the compiler allowed a particular layout but the runtime bombed with a TypeLoadException.

A Google search on [structlayout.explicit reference types] reveals some interesting discussions. See Overlaying several CLR reference fields with each other in explicit struct?, for example.

Jim Mischel
  • There is no TypeLoadException for the MyUnion struct. It will only go belly-up when you overlap a value type value with a reference type object. Like the OP did. Try it. – Hans Passant Mar 13 '13 at 18:37
  • That seems ... dangerous. Or perhaps not. Hmmmm . . . I guess it could be safe to overlay two reference types. Runtime exception will occur if I try to dereference `o2` after setting `o1` to something that's not a `SomeRefType`. – Jim Mischel Mar 13 '13 at 18:42
  • This seems dangerous to me: you can overlay two arrays with different types. This allows you to access memory beyond the smaller array region. – Ark-kun Nov 24 '14 at 01:29
  • @Ark-kun: The runtime will prevent overlaying a reference type with a value type. If you're overlaying two value types, the amount of memory allocated will be the larger of the two. So the scenario you allude to isn't a problem. – Jim Mischel Nov 24 '14 at 12:34
  • @JimMischel Overlaying two arrays is not "overlaying a reference type with a value type". Arrays are references in .NET. Overlaying arrays allows you to circumvent the CLR type checks and array bounds protection and corrupt memory. – Ark-kun Nov 25 '14 at 00:27
  • @Ark-kun: Arrays in .NET are reference types. So, again, there is no danger of circumventing type safety. – Jim Mischel Nov 25 '14 at 03:33
  • @JimMischel It's all a matter of naming. We have a pointer to a 1024-byte location. This pointer can be treated as both byte[1024] and int[1024]. In the latter case, only the first 256 array elements point into the allocated 1024-byte memory block. Writing to the higher elements corresponds to accessing memory outside the allocated area. You're overwriting memory that doesn't belong to your variables. No matter what you call it, it circumvents memory safety, which is even worse than circumventing type safety. Why not try it? – Ark-kun Nov 25 '14 at 19:01
  • @Ark-kun: *Arrays are reference types*. An array reference in that structure will occupy `sizeof(IntPtr)` bytes. The array itself is stored in a separate block of memory. No chance of violating type safety. Now, if you're talking about [fixed sized buffers](http://msdn.microsoft.com/en-us/library/zycewsya.aspx), you have a legitimate concern. But those are only allowed in an unsafe context, for which the concerns you raise already exist. – Jim Mischel Nov 25 '14 at 19:34
  • @JimMischel "Arrays are reference types." - Yes. "An array reference in that structure will occupy sizeof(IntPtr) bytes." - Yes. "The array itself is stored in a separate block of memory." - Yes. The elements are stored in a heap block, alongside all of the other heap objects belonging to your process. "Now, if you're talking about fixed sized buffers" - No, just normal managed arrays in the heap. "No chance of violating type safety." - Arbitrarily overwriting memory that does not belong to your objects !in safe context! violates everything, including type safety. Don't you think so? – Ark-kun Nov 26 '14 at 02:27
  • @JimMischel Suppose you could successfully do `byte[] a = new byte[10]; a[2000] = 11;` If this worked, it would be disastrous, right? But that's exactly what I accomplished. – Ark-kun Nov 26 '14 at 02:30