
I need to do complex marshaling of several nested structures that contain variable-length arrays of other structures, so I decided to use ICustomMarshaler (see JaredPar's good tutorial here). But then I ran into a problem with a struct defined in C++ as:

struct AStruct{
    int32_t     a;
    AType*      b;
    int32_t     bLength;
    bool        aBoolean;
    bool        bBoolean;
};

On the C# side, in the MarshalManagedToNative implementation of ICustomMarshaler I was using:

Marshal.WriteByte(intPtr, offset, Convert.ToByte(aBoolean));
offset += 1;
Marshal.WriteByte(intPtr, offset, Convert.ToByte(bBoolean));

But this did not work, and I discovered that each bool in the C++ struct appeared to take 2 bytes: on x86, sizeof(AStruct) is 16, not 14. OK, bool is not guaranteed to take 1 byte, so I tried unsigned char and uint8_t instead, but the size is still 16.

Now, I know I could use an int32 instead of a boolean, but I care about the space taken, and several structs containing booleans are written to disk (I use the HDF5 file format and I want to map those booleans to H5T_NATIVE_UINT8, a 1-byte type defined in the HDF5 library). Is there another way? I mean, can I have something inside a struct that is guaranteed to take 1 byte?

EDIT

The same problem also applies to int16 values: depending on how many values are present, alignment may make the final size of the struct differ from what I expect. On the C# side I do not "see" the C++ struct; I simply write to unmanaged memory, following the definition of my structs in C++. It is quite a simple process, but if I instead have to work out the real space taken by the struct (either by guessing or by measuring it), it becomes more difficult and error-prone every time I modify the struct.

Mauro Ganswer
  • C++ structures have a structure packing, which can have several options. The default structure packing for MSVC++ is to optimise for speed, which mostly entails aligning on 2 or 4 byte boundaries. If you have control of the C++ project and are working in MSVC then you can use the `#pragma pack` command to modify the packing rules for the structure. – MicroVirus May 22 '14 at 11:37
  • Thanks @MicroVirus. I have control over the C++ project and I'll try the #pragma pack you suggested. If I set it to 4 bytes, will it work for any struct and for any architecture (32 and 64bit)? – Mauro Ganswer May 22 '14 at 11:41
  • This SO answer might help you: http://stackoverflow.com/a/3318475/2718186 and MSDN http://msdn.microsoft.com/en-us/library/2e70t5y1.aspx – MicroVirus May 22 '14 at 11:44

2 Answers


sizeof(AStruct) = 16, not 14

That's correct. The struct has two extra bytes at the end that are not used. They ensure that, if you put the struct in an array, the fields of every element are still properly aligned. In 32-bit mode, the int32_t and AType* members take 4 bytes each and should be aligned to a multiple of 4 so the processor can access them quickly. That can only be achieved if the structure size is a multiple of 4. Thus 14 is rounded up to 16.

Do keep in mind that this does not mean that the bool fields take 2 bytes. A C++ compiler uses just 1 byte for them. The extra 2 bytes are pure padding.
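The rounding can be reproduced with a few lines of arithmetic. This is only a sketch (C++14 or later): alignUp and aStructSize are hypothetical helper names, and it assumes bool takes 1 byte and every scalar member is aligned to its own size, which holds for the mainstream compilers but is not guaranteed by the standard.

```cpp
#include <cstddef>

// Hypothetical helper: round `offset` up to the next multiple of `align`.
constexpr std::size_t alignUp(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Size of AStruct under natural alignment, for a given pointer size.
// Members: int32_t a; AType* b; int32_t bLength; bool aBoolean; bool bBoolean;
constexpr std::size_t aStructSize(std::size_t pointerSize) {
    const std::size_t sizes[] = {4, pointerSize, 4, 1, 1};
    std::size_t offset = 0;
    std::size_t maxAlign = 1;
    for (std::size_t size : sizes) {
        // Assumption: each scalar member is aligned to its own size.
        offset = alignUp(offset, size) + size;
        if (size > maxAlign) maxAlign = size;
    }
    // Tail padding: round the total up to the largest member alignment.
    return alignUp(offset, maxAlign);
}

static_assert(aStructSize(4) == 16, "4-byte pointers: 14 bytes of data, padded to 16");
static_assert(aStructSize(8) == 24, "8-byte pointers: padding after a and at the end");
```

With 4-byte pointers the members end at byte 14 and two bytes of tail padding are added; the bool members themselves still occupy one byte each.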

If you use Marshal.SizeOf(typeof(AStruct)) in your C# program, you'll discover that the struct you declared takes 20 bytes. That mismatch is the problem you are trying to fix. The bool members are the cause, an issue that goes way, way back to early versions of the C language, which did not have a bool type. The default marshaling that the CLR uses is compatible with BOOL, the typedef in the winapi, which is a 32-bit type.

So you have to be explicit about it when you declare the struct in your C# code: you have to tell the marshaller that you want the 1-byte type. You do that by declaring the struct member as byte, or by overriding the default marshaling:

[StructLayout(LayoutKind.Sequential)]
private struct AStruct{
    public int    a;
    public IntPtr b;
    public int    bLength;
    [MarshalAs(UnmanagedType.U1)]
    public bool   aBoolean;
    [MarshalAs(UnmanagedType.U1)]
    public bool   bBoolean;
}

And you'll see that Marshal.SizeOf() now returns 16. Do be aware that you have to force your program to run in 32-bit mode: make sure that the EXE project's Platform Target setting is x86.
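To compare against what Marshal.SizeOf() reports, the native side can report its own numbers with sizeof and offsetof. A minimal sketch, assuming you can add a few exported functions to the C++ project; the AStruct_* function names are hypothetical, AType is left opaque because its definition is not shown, and any export decoration your build needs (such as __declspec(dllexport) on Windows) is omitted.

```cpp
#include <cstddef>
#include <cstdint>

struct AType;  // opaque here; only a pointer to it is stored

struct AStruct {
    int32_t a;
    AType*  b;
    int32_t bLength;
    bool    aBoolean;
    bool    bBoolean;
};

// Hypothetical helpers: the names are illustrative, not an existing API.
// Each one returns what the compiler actually chose for this build.
extern "C" int32_t AStruct_size() {
    return static_cast<int32_t>(sizeof(AStruct));
}
extern "C" int32_t AStruct_offset_aBoolean() {
    return static_cast<int32_t>(offsetof(AStruct, aBoolean));
}
extern "C" int32_t AStruct_offset_bBoolean() {
    return static_cast<int32_t>(offsetof(AStruct, bBoolean));
}
```

Calling these once at startup and comparing the results with the managed values turns a silent layout mismatch into an immediate, visible failure.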

Hans Passant
  • I do not have AStruct defined in C#, I do completely manual marshaling (Marshal.WriteXXX methods) and the problem was arising when I had an array of AStruct, since I was not taking care of the padding. Now thanks to your information on padding it works, however my app must work also on 64bit machine. In 64bit platforms are the structs always padded to multiple of 8? Moreover some of our customers use it on Linux via Mono, do you see any problem there? – Mauro Ganswer May 22 '14 at 11:38
  • Layout will be very different, the AType* pointer requires alignment to 8. So there's 4 bytes of padding after the `a` member and still 2 at the end for a total structure size of 24. This "manual marshaling" is a pretty big code smell, I don't want to predict what is going to happen on another operating system. – Hans Passant May 22 '14 at 11:45

This answer is in addition to what Hans Passant has said.

It might be easiest to have your structures use a fixed packing size, so you can readily predict the member layout. Keep in mind though that this could affect performance.

The rest of this answer is specific to Microsoft Visual C++, but most compilers offer their own variant of this.

To get you started, check out the SO answer "#pragma pack effect" and the MSDN page http://msdn.microsoft.com/en-us/library/2e70t5y1.aspx

What you often use is the pragma pack(push, ...) followed by pragma pack(pop, ...) idiom, which only affects packing for the structures defined between the two pragmas:

#pragma pack(push, 4)
struct someStructure
{
    char a;
    int b;
    ...
};
#pragma pack(pop)

This gives someStructure a predictable layout in which each of its members is aligned to at most 4 bytes.

EDIT: From the MSDN page on packing

The alignment of a member will be on a boundary that is either a multiple of n or a multiple of the size of the member, whichever is smaller.

So for pack(4) a char will be aligned on a 1-byte boundary, a short on a 2-byte, and the rest on a 4-byte boundary.
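A small sketch of that rule, using a hypothetical struct Packed4 that is not part of the original code. It assumes a compiler that understands #pragma pack (MSVC, GCC and Clang all do) and an 8-byte double.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 4)
struct Packed4 {       // hypothetical struct, for illustration only
    char    c;         // offset 0: aligned to min(4, 1) = 1
    int16_t s;         // offset 2: aligned to min(4, 2) = 2
    double  d;         // offset 4: aligned to min(4, 8) = 4
};
#pragma pack(pop)

// These fail at compile time if the compiler lays the struct out differently.
static_assert(offsetof(Packed4, c) == 0, "char on a 1-byte boundary");
static_assert(offsetof(Packed4, s) == 2, "short on a 2-byte boundary");
static_assert(offsetof(Packed4, d) == 4, "double capped at a 4-byte boundary");
static_assert(sizeof(Packed4) == 12, "12 is already a multiple of 4");
```

Note the one byte of padding between c and s: packing limits the alignment, but it does not remove padding from the middle of the struct.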

Which value is best depends on your situation. You'll need to explicitly pack all structures you intend to access, and probably all structures that are members of structures you want to access.
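As a sketch of what that could look like for the struct in the question: both AStruct and the struct it points to sit inside the same push/pop pair, and static_asserts pin down the offsets that manual marshaling code would rely on. The contents of AType are made up for illustration (the question does not show them), and uint8_t stands in for bool as the 1-byte flag type.

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 4)
struct AType {          // hypothetical contents; the original does not show AType
    int16_t id;         // offset 0, followed by 2 bytes of padding
    double  value;      // offset 4 under pack(4)
};

struct AStruct {
    int32_t a;
    AType*  b;
    int32_t bLength;
    uint8_t aBoolean;   // 1-byte flags instead of bool
    uint8_t bBoolean;
};
#pragma pack(pop)

// The pointer is still 4 or 8 bytes wide, so offsets after it depend on it.
constexpr std::size_t kPtr = sizeof(void*);

static_assert(sizeof(AType) == 12, "padding sits in the middle of AType");
static_assert(offsetof(AStruct, b) == 4, "no gap after a under pack(4)");
static_assert(offsetof(AStruct, bLength) == 4 + kPtr, "follows the pointer");
static_assert(offsetof(AStruct, aBoolean) == 8 + kPtr, "first flag");
static_assert(offsetof(AStruct, bBoolean) == 9 + kPtr, "second flag");
static_assert(sizeof(AStruct) == 12 + kPtr, "2 bytes of tail padding");
```

If someone later adds or reorders a member, the build breaks at the assertion instead of the marshaling code silently writing to the wrong offset.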

MicroVirus
  • OK, with pack(4) it seems to work on both 32 and 64bit. I hope that the padding will always be put at the end of the struct and not also in the middle, as suggested by this post: http://stackoverflow.com/a/3318475/2718186. A minor thing I do not understand is why, when I have a struct containing 10 bytes of values and I set pack(8), its real size is 12 even on 64bit... – Mauro Ganswer May 22 '14 at 12:03
  • @Ganswer: With `pack(4)` it becomes a fixed packing rule, independent of target platform. The packing bytes are put in the middle and at the end of the structure. For examples, see http://msdn.microsoft.com/en-us/library/71kf49f1.aspx – MicroVirus May 22 '14 at 12:11
  • but if they are also in the middle how can I predict where to write inside the unmanaged memory?? Is there a rule to follow? – Mauro Ganswer May 22 '14 at 12:29
  • OK, thanks, now it works. I marked yours as the answer, but a reader interested in the subject should also take note of Hans Passant's suggestions – Mauro Ganswer May 23 '14 at 08:52