structure layout optimization

Question

How much of a performance boost does byte-size optimization give you (making struct sizes multiples of 8, 32, 64, etc.)?

Here is a sample structure:

[StructLayout(LayoutKind.Explicit)]
public struct RenderItem
{
   [FieldOffset(0)] byte[] mCoordinates = new byte[3]; //(x,y,z)
   [FieldOffset(3)] short  mUnitType;            

}

So my question is, how important is it to do something like this:

[StructLayout(LayoutKind.Explicit)]
public struct RenderItem
{
   [FieldOffset(0)] byte[] mCoordinates = new byte[3]; //(x,y,z)
   [FieldOffset(4)] short  mUnitType;
   [FieldOffset(6)] byte[] mPadding = new byte[2];     //make total to 8 bytes

}

I'm sure it's one of those things that 'scales with size', so in particular I'm curious about operations that would see this structure being used about 150,000 times for creating a VertexBuffer Object:

// int[,,] objType: 3-dimensional int array with object type information stored in it

int i = 0;
RenderItem[] vboItems = new RenderItem[16 * 16 * 16 * 36];  // x - 16, y - 16, z - 16, 36 vertices per object

for (int x = 0; x < 16; x++)
{
     for (int y = 0; y < 16; y++)
     {
          for (int z = 0; z < 16; z++)
          {
               // assumes a RenderItem(x, y, z, unitType) constructor, not shown here
               vboItems[i++] = new RenderItem((byte)x, (byte)y, (byte)z, (short)objType[x, y, z]);
          }
     }
}

// Put vboItems into a VBO

Check out http://www.alexonlinux.com/aligned-vs-unaligned-memory-access. Also, if you don't specify `FieldOffset`, the compiler will automatically align your structures for you. — Matthew, Nov 28 '12 at 16:28
It will automatically align them, but will it guarantee the order they are in? I've seen in several places that the order is not guaranteed, and because the data is going into OpenGL, the order in which it's sent is important. — David Torrey, Nov 28 '12 at 16:30

score 13 · Accepted Answer · edited May 23 '17 at 11:59

I'll assume you applied the [MarshalAs] attribute to make the array a ByValArray, the only thing that makes sense on a structure like this. You are in fact making it slower by making the struct 2 bytes larger. The larger struct uses the processor's caches less efficiently: fewer structs will fit when you use them in an array, which is very important to perf.
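To put rough numbers on the cache argument, here is a back-of-the-envelope sketch. It assumes 64-byte cache lines, an array that starts on a cache-line boundary, and the 16 * 16 * 16 * 36 element count from the question. CacheFootprint and LinesTouched are made-up names for illustration.

```csharp
using System;

static class CacheFootprint
{
    // Assumption: 64-byte cache lines.
    public const int CacheLineBytes = 64;

    // Number of cache lines needed to hold `count` tightly packed elements,
    // assuming the array starts on a cache-line boundary.
    public static long LinesTouched(long count, int elementBytes)
    {
        long totalBytes = count * elementBytes;
        return (totalBytes + CacheLineBytes - 1) / CacheLineBytes;
    }

    static void Main()
    {
        long count = 16 * 16 * 16 * 36; // 147,456 elements
        Console.WriteLine(LinesTouched(count, 6)); // 6-byte struct
        Console.WriteLine(LinesTouched(count, 8)); // struct padded to 8 bytes
    }
}
```

Under those assumptions the padded version spans 18,432 cache lines against 13,824 for the 6-byte version, a third more memory traffic for the same data.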

The default StructLayoutAttribute.Pack value of 8 is already optimized to give the best possible layout of the structure. It does not in fact have any effect on your struct, the members are already aligned optimally regardless of the Pack value. The rules for any modern processor to get the best perf:

A member should be aligned to an address that's divisible by the member size. This may add padding bytes between members. This rule saves the processor from having to multiplex the byte values from a memory read, or from performing two reads and gluing the bytes together. Not an issue on your struct: the only member that requires alignment is mUnitType, which needs 2-byte alignment and already sits at offset 4. Also note that you don't have to use [FieldOffset], the default layout is already good.
A member should be aligned properly when the struct is used in an array. This may add padding to the end of the struct to get the next element in the array properly aligned. Again not an issue on your struct: it is 6 bytes long, so the next element in an array will have its mUnitType aligned, since it only requires 2. If you in fact declared the array without [MarshalAs], then the jitter will automatically add 2 bytes of padding without your help to ensure that the array pointer is aligned correctly.
A member should never straddle a CPU cache line, which is 64 bytes on any modern processor I know. Straddling is very detrimental to perf: the CPU has to read two cache lines' worth of data and glue the bytes together, and the perf hit is around 3x. This may happen on a 32-bit machine when a struct contains a member of size 8 or greater, so a long, double or decimal. Not only the alignment of the members matters, but also where the structure is allocated in memory. That's a bit of a problem on the x86 version of .NET, which can only guarantee that the start address is aligned to a multiple of 4 for data allocated from the stack or the GC heap. Not an issue for x64. And not an issue for your struct: it only contains small members that can never straddle a cache line.

So by these rules you don't have to help, the struct is already optimal as-is, even without LayoutKind.Explicit.
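One way to check this rather than take it on faith is to ask the marshaller for the layout. The sketch below assumes the [MarshalAs] declaration guessed at in the first paragraph and leaves the default sequential layout in place. LayoutProbe is a made-up helper name, and the numbers it reports describe the marshalled (unmanaged) layout.

```csharp
using System;
using System.Runtime.InteropServices;

// No [StructLayout] attribute: structs default to LayoutKind.Sequential
// with the default Pack value.
public struct RenderItem
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public byte[] mCoordinates; // (x, y, z), embedded inline when marshalled
    public short mUnitType;
}

static class LayoutProbe
{
    // Total marshalled size of one RenderItem, in bytes.
    public static int Size() => Marshal.SizeOf<RenderItem>();

    // Byte offset of mUnitType within the marshalled struct.
    public static int UnitTypeOffset() =>
        Marshal.OffsetOf<RenderItem>("mUnitType").ToInt32();

    static void Main()
    {
        Console.WriteLine($"size = {Size()}, mUnitType offset = {UnitTypeOffset()}");
    }
}
```

If the reasoning above holds, this should report a size of 6 with mUnitType at offset 4, with no [FieldOffset] in sight.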

One other consideration applies, one that doesn't have anything to do with alignment. A short is not an optimal data type for a 32-bit or 64-bit processor. If you do anything beyond simple loads and stores, extra overhead is required to convert it from 16 to 32 bits. The background story behind that one is here. You now need to balance better CPU cache usage against less efficient operations, something you can only do reliably with a profiler.
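The widening is visible at the language level in C#: arithmetic on two short values is carried out as int, so storing the result back into a short needs an explicit narrowing cast. A small sketch with illustrative names; how much this costs at run time depends on the JIT and the CPU, which is why a profiler has the last word.

```csharp
using System;

static class ShortMath
{
    // short + short is evaluated as int in C#; the cast narrows it back to 16 bits.
    public static short AddUnitTypes(short a, short b)
    {
        return (short)(a + b);
    }

    static void Main()
    {
        short a = 1200, b = 34;
        Console.WriteLine(AddUnitTypes(a, b));
        // int sum = a + b;    // compiles: the result type is already int
        // short bad = a + b;  // does not compile without a cast (CS0266)
    }
}
```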

I know cheering is frowned upon on SO, but superb answer. — Dave Jellison, Jan 31 '19 at 01:24

score 2 · Answer 2 · answered Nov 28 '12 at 17:07

That does not work the way it seems you think it does.

[StructLayout(LayoutKind.Explicit)]
public struct RenderItem
{
   [FieldOffset(0)] byte[] mCoordinates = new byte[3]; //(x,y,z)
   [FieldOffset(3)] short  mUnitType;            

}

An array is a reference type. The storage requirement of mCoordinates is IntPtr.Size (i.e. 4 bytes on x86 and 8 bytes on x64). The 3 byte elements are stored on the heap.

I don't know what bad things might (or might not) happen if you use FieldOffset to overlap a reference like that.

If you need a structure with that exact layout, you need to create another value type:

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Coordinate
{
    byte x;
    byte y;
    byte z;
};


[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RenderItem
{
   Coordinate coord;
   short  mUnitType;            

}

This doesn't answer your question about alignment, but the link provided by Matthew does.
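To see what the Pack = 1 version actually produces, you can ask the runtime for the size and offset. This sketch repeats the two declarations above with public fields so they can be inspected; PackedLayoutCheck is a hypothetical helper name.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Coordinate
{
    public byte x;
    public byte y;
    public byte z;
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RenderItem
{
    public Coordinate coord;
    public short mUnitType;
}

static class PackedLayoutCheck
{
    // Total marshalled size of one RenderItem, in bytes.
    public static int Size() => Marshal.SizeOf<RenderItem>();

    // Byte offset of mUnitType within the marshalled struct.
    public static int UnitTypeOffset() =>
        Marshal.OffsetOf<RenderItem>("mUnitType").ToInt32();

    static void Main()
    {
        Console.WriteLine($"size = {Size()}, mUnitType offset = {UnitTypeOffset()}");
    }
}
```

With Pack = 1 the struct should come out at 5 bytes with mUnitType at offset 3. That is the exact layout from the question, but it also puts the short on an odd address, which is the unaligned case the accepted answer's first rule warns about.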

However, since C# 2 we can define an 'unsafe' struct and use the 'fixed' keyword on the array field, with a length defined at compile time. This results in a struct that contains the array elements/data, rather than a ref to an array object. — redcalx, Dec 09 '17 at 22:03
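A minimal sketch of what this comment describes, assuming the project is compiled with unsafe code enabled (AllowUnsafeBlocks). The names RenderItemFixed and FixedBufferDemo are hypothetical, chosen to avoid confusion with the versions above.

```csharp
using System;

// Requires compiling with unsafe code enabled.
public unsafe struct RenderItemFixed
{
    public fixed byte mCoordinates[3]; // three bytes stored inline, no heap array
    public short mUnitType;
}

static class FixedBufferDemo
{
    // Managed size of the struct in bytes.
    public static unsafe int Size() => sizeof(RenderItemFixed);

    // Writes the fields of a local item and reads one coordinate back.
    public static unsafe byte StoreAndReadZ(byte x, byte y, byte z, short unitType)
    {
        RenderItemFixed item = new RenderItemFixed();
        item.mCoordinates[0] = x;
        item.mCoordinates[1] = y;
        item.mCoordinates[2] = z;
        item.mUnitType = unitType;
        return item.mCoordinates[2];
    }

    static void Main()
    {
        Console.WriteLine(Size());
        Console.WriteLine(StoreAndReadZ(1, 2, 3, 7));
    }
}
```

Because the three bytes live inside the struct, the default layout rules from the accepted answer apply directly: the short is aligned to 2, so the struct should come out at 6 bytes.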
