Suggestions for interop'ing with size_t via PInvoke

Question

We have a native code SDK which predominantly uses the C/C++ size_t type for things like array sizes. We additionally provide a .NET wrapper (written in C#) which uses PInvoke to invoke the native code, for those that want to integrate our SDK into their .NET app.

.NET has the System.UIntPtr type, which pairs perfectly with size_t, and functionally everything works as expected. Some of the C# structures passed to the native side contain System.UIntPtr fields, and because those structures are exposed to consumers of the .NET API, consumers have to work with System.UIntPtr too. The problem is that System.UIntPtr does not interoperate well with the usual .NET integer types: casts are required, and even "basic" operations such as comparisons against integers or literals don't work without more casting.

We tried declaring the exported size_t params as uint and applying the MarshalAsAttribute(UnmanagedType.SysUInt) but that results in a runtime error for invalid marshaling. For example:

[DllImport("Native.dll", EntryPoint = "GetVersion")]
private static extern System.Int32 GetVersion(
    [Out, MarshalAs(UnmanagedType.LPStr, SizeParamIndex = 1)]
    StringBuilder strVersion,
    [In, MarshalAs(UnmanagedType.SysUInt)]
    uint uiVersionSize
);

Calling GetVersion in C# passing a uint for the 2nd param results in this marshal error at runtime:

System.Runtime.InteropServices.MarshalDirectiveException: Cannot marshal 'parameter #2': Invalid managed/unmanaged type combination (Int32/UInt32 must be paired with I4, U4, or Error).

We could create facade wrappers which expose 'int' types in .NET and internally do the casting to System.UIntPtr for native-compatible classes, but (a) we worry about performance of copying the buffers (which could be very large) between near-duplicate classes and (b) it's a bunch of work.

Any suggestions on how to PInvoke with size_t types while maintaining a convenient API in .NET?

Here's a sample of one case which is effectively the same as our real code but with simplified/stripped names. NOTE This code is derived from our production code by hand. It compiles for me, but I've not run it.

Native (C/C++) code:

#ifdef __cplusplus
extern "C"
{
#endif


enum Flags
{
    DEFAULT_FLAGS = 0x00,

    LEVEL_1 = 0x01,
};


struct Options
{
    Flags flags;

    size_t a;

    size_t b;

    size_t c;
};


int __declspec(dllexport) __stdcall InitOptions(
    Options * const pOptions)
{
    if(pOptions == nullptr)
    {
        return(-1);
    }

    pOptions->flags = DEFAULT_FLAGS;
    pOptions->a = 1234;
    pOptions->b = static_cast<size_t>(0xFFFFFFFF);
    pOptions->c = (1024 * 1024 * 1234);

    return(0);
}


#ifdef __cplusplus
}
#endif

Managed (C#) Code: (This should repro the incorrect marshalling. Changing the fields a, b, and c in the struct to type UIntPtr makes it function properly.)

using System;
using System.Runtime.InteropServices;

namespace Test
{
    public enum Flags
    {
        DEFAULT_FLAGS = 0x00,

        LEVEL_1 = 0x01,
    }


    [System.Runtime.InteropServices.StructLayoutAttribute(System.Runtime.InteropServices.LayoutKind.Sequential)]
    public struct Options
    {
        public Flags flags;

        public uint a;

        public uint b;

        public uint c;
    }


    public class Test
    {
        [DllImport("my.dll", EntryPoint = "InitOptions", CallingConvention = CallingConvention.StdCall)]
        internal static extern Int32 InitOptions(
            [In, Out]
            ref Options options
        );

        static void Main(string[] args)
        {
            Options options = new Options
            {
                flags = Flags.DEFAULT_FLAGS,
                a = 111,
                b = 222,
                c = (1024 * 1024 * 1)
            };

            Int32 nResultCode = InitOptions(
                ref options
            );

            if(nResultCode != 0)
            {
                System.Console.Error.WriteLine("Failed to initialize options.");
            }

            if(   options.flags != Flags.DEFAULT_FLAGS
                || options.a != 1234
                || options.b != 0xFFFFFFFF   // native side stores static_cast<size_t>(0xFFFFFFFF)
                || options.c != (1024 * 1024 * 1234) )
            {
                System.Console.Error.WriteLine("Options initialization failed.");
            }
        }
    }

}

I tried changing the enum field in the managed struct to an int type and it still doesn't work.

I'll test more with size_t function params next.

`We tried declaring the exported size_t params as uint and applying the MarshalAsAttribute(UnmanagedType.SysUInt) but that results in a runtime error for invalid marshaling` It should work if you use `U4` for x86 or `U8` for x64, but you should be able to just pass `uint` or `ulong` (as appropriate) and let the default marshalling do the work. — Matthew Watson, Dec 15 '18 at 15:19
@MatthewWatson Our .NET assembly wrapper is built twice--once for 32-bit and once for 64-bit--with the same code . We'd like to preserve this. uint gives us the same marshaling error: System.Runtime.InteropServices.MarshalDirectiveException: Cannot marshal 'parameter #2': Invalid managed/unmanaged type combination (Int32/UInt32 must be paired with I4, U4, or Error). — codesniffer, Dec 15 '18 at 15:27
size_t is a [wart of history](https://stackoverflow.com/questions/10168079/why-is-size-t-unsigned). No reason to let it cramp your style, using *int* is almost always appropriate. The OS keeps you out of trouble, you can't allocate more than 2GB in one whack, even on the 64-bit version. — Hans Passant, Dec 15 '18 at 15:58
@SimonMourier uint causes a marshal error. I've edited the question to include this and the resulting error. — codesniffer, Dec 16 '18 at 12:24
I got that, but I meant uint (or int) w/o any MarshalAs parameter. It should work just like that. — Simon Mourier, Dec 16 '18 at 12:43
@SimonMourier We were excited and baffled to see it work with a uint type at first, then found that it was just a coincidence of testing with a single parameter. It does not work, which makes sense when you think about it. Declaring the type as a int/uint in C# is declaring it a 32-bit size. So when this is run in a 64-bit process only half of the needed sizes are passed over. We confirmed the values received and returned on the native side are wrong for anything but the most trivial case (single param). — codesniffer, Dec 28 '18 at 00:30
I tested it too. Can you share an example where it doesn't work? — Simon Mourier, Dec 28 '18 at 08:16
@SimonMourier unfortunately we're testing this with our actual software which we can't share. But the first case where it doesn't work is quite simple. On the managed side: A struct (attributed LayoutKind.Sequential) with 4 fields: an enum and 3 uints. This corresponds to a native struct with an enum field and 3 size_t fields. The struct is passed to a native function by itself (by ref of course) and must be an in/out param. C# sets the uints to 60, 10, 1048576, and native sets them to (-1), 1000, 33554432. Both sides receive it wrong with uint, and correct with UintPtr. — codesniffer, Dec 28 '18 at 12:58
You mean the size_t is not a method parameter, but a field inside a struct that's passed as a method parameter? This is a quite different matter and you should have stated that initially. Please post the exact piece of code. We don't need your whole project, but we need something that exactly corresponds to your question. — Simon Mourier, Dec 28 '18 at 19:15
@SimonMourier We use size_t in both structs and function params. The struct doesn't work for me. I've added corresponding sample code in the question. I'll test the same with function params next. I'm curious why these two are a "quite different matter"? — codesniffer, Dec 29 '18 at 05:00
Ok, so that's what I thought. Using uint or int won't work in structs because it changes the struct layout/offsets, of course. The universal binary equivalent of size_t is IntPtr (or UIntPtr). Period. My suggestion was just for method arguments, but it's more a trick. — Simon Mourier, Dec 29 '18 at 11:27
My last ditch thought on this is to create a custom marshaler for uint ->size_t, but I've only seen documentation and examples for marshaling non-POD types (arrays, objects, etc). Is it possible in .NET to create a custom marshaler for uint? — codesniffer, Dec 29 '18 at 14:57
@SimonMourier Thanks for your time & effort. Feel free to post your suggestions as an answer since it at least solves dealing with size_t in function params. — codesniffer, Dec 29 '18 at 14:58

Simon Mourier · Answer 1 · 2018-12-29T16:20:12.077

The binary equivalent to size_t is IntPtr (or UIntPtr). But for parameters, you can just use int or uint without any additional attribute.

So, if you have this in C/C++:

int InitOptions(size_t param1, size_t param2);

then you can declare it like this in C# and it will work for x86 and x64 (well, you won't get any bit value above 32 of course, the hi-uint is lost):

[DllImport("my.dll")]
static extern int InitOptions(int param1, int param2); // or uint

For x86 it works because, well, it's just supposed to.

For x64, it works magically because arguments are always 64-bit, and luckily, the extra hi-bits are zeroed by errrhh... some components of the system (the CLR? C/C++ compiler? I'm unsure).

For struct fields this is a completely different story; the simplest approach (to me) seems to be to use IntPtr and add some helpers to ease programming.
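Why struct fields are different can be checked on the native side. The following is a minimal sketch, not part of the original SDK: OptionsNative repeats the question's struct, while OptionsAsUInt32 and layouts_match are hypothetical names introduced here to mirror what a C# struct with uint fields would describe. It assumes a mainstream compiler where the enum occupies 4 bytes.

```cpp
#include <cstddef>
#include <cstdint>

enum Flags { DEFAULT_FLAGS = 0x00, LEVEL_1 = 0x01 };

// Layout the native code is compiled with: three size_t fields.
struct OptionsNative {
    Flags flags;
    std::size_t a;
    std::size_t b;
    std::size_t c;
};

// Hypothetical mirror of a C# struct that declares a, b and c as uint.
struct OptionsAsUInt32 {
    Flags flags;
    std::uint32_t a;
    std::uint32_t b;
    std::uint32_t c;
};

// True only if both structs have the same total size and the same field
// offsets, which is what passing one in place of the other would require.
constexpr bool layouts_match() {
    return sizeof(OptionsNative) == sizeof(OptionsAsUInt32)
        && offsetof(OptionsNative, a) == offsetof(OptionsAsUInt32, a)
        && offsetof(OptionsNative, b) == offsetof(OptionsAsUInt32, b)
        && offsetof(OptionsNative, c) == offsetof(OptionsAsUInt32, c);
}

// Compile-time guard: wherever size_t is wider than 32 bits, the two
// layouts cannot agree.
static_assert(sizeof(std::size_t) == sizeof(std::uint32_t) || !layouts_match(),
              "a wider size_t cannot share a layout with uint32_t fields");
```

On a typical 64-bit build the native struct is 32 bytes with a at offset 8, while the uint mirror is 16 bytes with a at offset 4, so every field after flags is read from the wrong place. On a typical 32-bit build the two layouts coincide, which is why the problem only shows up in the 64-bit process.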

However, I've added some extra sample code if you really want to add some sugar for the developers using your structs. What's important is this code could (should) be generated from the C/C++ definitions.

public static int InitOptions(ref Options options)
{
    if (IntPtr.Size == 4)
        return InitOptions32(ref options);

    Options64 o64 = options;
    var i = InitOptions64(ref o64);
    options = o64;
    return i;
}

[DllImport("my64.dll", EntryPoint = "InitOptions")]
private static extern int InitOptions64(ref Options64 options);

[DllImport("my32.dll", EntryPoint = "InitOptions")]
private static extern int InitOptions32(ref Options options);

[StructLayout(LayoutKind.Sequential)]
public struct Options // could be class instead (remove ref)
{
    public Flags flags;
    public uint a;
    public uint b;
    public uint c;

    public static implicit operator Options64(Options value) => new Options64 { flags = value.flags, a = value.a, b = value.b, c = value.c };
}

[StructLayout(LayoutKind.Sequential)]
public struct Options64 // could be class instead (remove ref)
{
    public Flags flags;
    public ulong a;
    public ulong b;
    public ulong c;

    public static implicit operator Options(Options64 value) => new Options { flags = value.flags, a = (uint)value.a, b = (uint)value.b, c = (uint)value.c };
}

Note that if you use classes instead of structs for Options and Options64, you can remove all the ref argument directions and avoid the painful copy from structs (operator overloading doesn't work well with ref). But this has other implications, so it's up to you.

Here is another discussion on the same subject: C# conditional compilation based on 32-bit/64-bit executable target

Basically, what you could also do is use conditional compilation constants for x86 and x64 targets and have your code vary using that.

Can you double-check that defining the C# structs as classes properly marshals the data? I know we tried and it did not work, though there's been so many incarnations I don't remember whether it used uint or UIntPtr types. — codesniffer, Dec 29 '18 at 18:08
Yes using class instead of struct works fine, but you can't pass say a 32-bit Options variable to the 64-bit InitOptions64 method as is, even if it compiles thanks to the overloaded constructor. You must pass a 64-bit Options64 variable — Simon Mourier, Dec 30 '18 at 10:53
Thanks for posting this, but I can't accept as answer knowing now that this approach is subject to **silently** dropping data (for example in 64-bit build, if native code returns value greater than 2^32). — codesniffer, Jan 13 '19 at 19:24

codesniffer · Answer 2 · 2019-01-02T23:51:08.540

Here's what I ended up doing:

First some goals:

Expose .NET-friendly and customary types to .NET library users.
Avoid data being silently lost when interop'ing with native code.
Avoid propagating 32-bit/64-bit distinction to .NET library users (in other words, avoid having type differences outside my .NET API due to underlying native DLL bitness; strive for a single data type that (mostly) hides the bitness issue).
Nice to minimize having separate structures and/or code paths for 32-vs-64 bit.
Naturally all things developers prefer (less code to write & maintain, easier to keep in sync, etc).

FUNCTIONS

The C functions exported from the DLL are presented in the DllImport with .NET types as close as possible to the native (C) types. Then each function is wrapped with a more-inline-with-.NET facade.

This accomplished 2 things:

1. Preserving the native types in the DllImport avoids silent (!) data loss. As Simon Mourier pointed out, .NET can use uint in place of size_t in functions. While this seems to work, it also will silently drop data that's out of range. So if the native code returns a value larger than uint.MaxValue, our .NET code will never know. I'd rather handle the situation than have some spurious bug.
2. Various techniques and types which are specific to C and/or non-object oriented are presented in a style more native to .NET. For example, buffers in the C API which are presented as a byte pointer plus a size parameter are presented as simply byte arrays in .NET. Another example is non-zero-terminated strings (ex. UTF, XML) are presented in .NET as a String or Xml object instead of byte array and size parameters.
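The silent loss described in the first point, and the kind of check that avoids it, can be sketched in a few lines. This is an illustration under assumptions rather than the wrapper's actual code: truncate_to_u32, fits_in_u32 and checked_to_u32 are hypothetical helper names, and std::uint64_t stands in for a 64-bit size_t.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// What a 32-bit declaration does to a 64-bit size: the high half is dropped
// without any error being reported.
constexpr std::uint32_t truncate_to_u32(std::uint64_t value) {
    return static_cast<std::uint32_t>(value);
}

// True when the value survives a trip through a 32-bit field unchanged.
constexpr bool fits_in_u32(std::uint64_t value) {
    return value <= std::numeric_limits<std::uint32_t>::max();
}

// Checked narrowing: out-of-range values raise an error instead of
// quietly losing bits.
constexpr std::uint32_t checked_to_u32(std::uint64_t value) {
    return fits_in_u32(value)
        ? static_cast<std::uint32_t>(value)
        : throw std::overflow_error("value does not fit in 32 bits");
}
```

With these helpers, truncate_to_u32(0x100000001) yields 1 with no indication that anything was lost, whereas checked_to_u32 throws for the same input. The second behavior is the one the facade aims for.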

Specifically for size_t function params, they are presented as UIntPtr in the DllImport (per #1 above), and if still necessary to be exposed to the library user, they are presented as either uint or ulong as applicable. The facade then verifies the value of each (in/out as applicable) and throws an exception if there's an incompatibility.

Here's an example using pseudo-code:

C Function:

// Consume & return data in buf and pBufSize
int __declspec(dllexport) __stdcall Foo(
    byte * buf,
    size_t * pBufSize
);

C# DllImport:

[DllImport("my.dll", EntryPoint = "Foo", CallingConvention = CallingConvention.StdCall)]
private static extern System.Int32 Foo(
    [In, Out, MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 1)]
    System.Byte[] buf,
    [In, Out]
    ref System.UIntPtr pBufSize
);

C# Facade (pseudo-code):

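// Assumes the Foo DllImport above is declared in the same class.
// 'ref' lets the caller see the resized array.
public static void Foo(ref System.Byte[] buf)
{
    // Verify buffer size will fit in a native size_t
    // (UIntPtr is only 32 bits wide in a 32-bit process)
    if (System.UIntPtr.Size == 4 && buf.LongLength > uint.MaxValue)
        throw new System.OverflowException("Buffer is too large for the native API.");

    System.UIntPtr bufSize = (System.UIntPtr)(ulong)buf.LongLength;

    System.Int32 nResult = Foo(
        buf,
        ref bufSize
    );

    // Assumption: the native function returns 0 on success
    if (nResult != 0)
        throw new System.InvalidOperationException("Foo failed with code " + nResult + ".");

    // Verify return size is valid
    if (bufSize.ToUInt64() > int.MaxValue)   // .NET array size type is 'int'
        throw new System.OverflowException("Returned size does not fit in a .NET array.");

    System.Array.Resize(ref buf, (int)bufSize.ToUInt64());
}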

STRUCTURES

To interop with structures containing size_t (and even in general), I followed a similar approach as with functions: create a .NET structure ("Interop Structure") which most closely resembles the native-code structure, and then put a .NET-friendly facade around it. The facade then does value checking as appropriate.

The specific implementation approach I took for the facade was to setup each field as a property with the Interop Structure as the backing store. Here's a small example:

C Structure:

struct Bar
{
    MyEnum e;
    size_t s;
};

C# (pseudo-code):

public class Bar
{
    // Optional c'tor if param(s) are required to be initialized for typical use

    // Accessor for e
    public MyEnum e
    {
        get
        {
            return m_BarInterop.e;
        }
        set
        {
            m_BarInterop.e = value;
        }
    }

    // Accessor for s
    public uint s
    {
        get
        {
            VerifyUIntPtrFitsInUint(m_BarInterop.s);   // will throw an exception if value out of range
            return (uint)m_BarInterop.s;
        }
        set
        {
            // uint will always fit in UIntPtr
            m_BarInterop.s = (UIntPtr)value;
        }
    }

    // Interop-compatible 'Bar' structure (not required to be inner struct)
    [System.Runtime.InteropServices.StructLayoutAttribute(System.Runtime.InteropServices.LayoutKind.Sequential)]
    internal struct Bar_Interop
    {
        public MyEnum e;
        public System.UIntPtr s;
    }

    // Instance of interop-compatible 'Bar' structure
    internal Bar_Interop m_BarInterop;
}

While a bit tedious at times, I've found that after taking this approach for only 2 structures so far it's yielded great flexibility and a clean API being exposed to consumers of my .NET wrapper.
