I have the following native C++ function:

// Decode binary format from file 'filename' into stream 'output'
bool read_private_format(const char * filename, std::ostringstream & output);

After reading a previous post on SO about StringBuilder and delegates, I created an intermediate C function to expose to the C# layer:

extern "C" {
  typedef char *(*StringBuilderCallback)(int len);
  __attribute__ ((visibility ("default")))
  bool c_read_private_format(const char * filename, StringBuilderCallback ensureCapacity, char *out, int len) {
    std::ostringstream oss;
    if( read_private_format(filename, oss) ) {
      const std::string str = oss.str();
      if( str.size() > len )
        out = ensureCapacity(str.size());
      strcpy(out, str.c_str());
      return true;
    }
    return false;
  }
}

while on the C# side:

private delegate System.Text.StringBuilder StringBuilderEnsureCapacity(int capacity);
[System.Runtime.InteropServices.DllImport(NativeLibraryName, EntryPoint="c_read_private_format")]
private static extern bool c_read_private_format(string filename, System.IntPtr aCallback, System.Text.StringBuilder data, int size);

private static System.Text.StringBuilder callback(int capacity)
{
    buffer.EnsureCapacity( capacity );
    return buffer;
}

public static string readIntoString(string filename) {
  StringBuilderEnsureCapacity del = new StringBuilderEnsureCapacity(callback);
  System.IntPtr ptr = System.Runtime.InteropServices.Marshal.GetFunctionPointerForDelegate(del);
  if( c_read_private_format( filename, ptr, buffer, buffer.Capacity ) ) {
    string str = buffer.ToString();
    return str;
  }
  return null;
}

For some reason this does not work as expected: when I print the address of the char* returned by the callback, it looks as if the pointer is the one from before the call to EnsureCapacity (I can verify this by making a second call, in which case the char* in the C layer is different).

My question is:

  • How can I efficiently retrieve a UTF-8 string from C in .NET SDK (5.0.202) ?

I do not know in advance how long the string will be. Technically I could overestimate the StringBuilder capacity so that I can reuse it across my files, but it feels as if there could be a better approach to passing a growing stream to the C layer.

malat
  • Allocate the memory in the callee and deallocate in the caller is probably best here. Either use a shared allocator, or export the native deallocate. That call back idea is clunky in my view. – David Heffernan Apr 15 '21 at 07:35
  • I do agree that the callback looks clunky, I do not however agree the memory callee/caller pattern is the best here. Why not implement a OFstream interface in C which map to a [StringStream C#](http://web.archive.org/web/20130414075835/http://www.codingday.com/string-stream-for-net/) – malat Apr 15 '21 at 07:54
  • It's obviously up to you how you do this. I tend to use low level C style interfaces for this sort of interop and wrap it up in a more idiomatic fashion in whichever language I am using to consume the native library. The point being that the pinvoke is just a stepping stone, and isn't seen by the higher level code. But do it however you please. – David Heffernan Apr 15 '21 at 13:03

1 Answer

There is no point in trying to optimize the posted code, since the P/Invoke layer as written misses the most important point:

❌ AVOID StringBuilder parameters. StringBuilder marshaling always creates a native buffer copy. As such, it can be extremely inefficient.
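
One way to avoid the StringBuilder parameter altogether, along the lines suggested in the comments (allocate in the callee, export a native deallocate), might look like the sketch below. It is only an illustration under stated assumptions: the names c_read_private_format_alloc and c_free_private_format are hypothetical, and read_private_format is replaced by a stand-in so the snippet compiles on its own.

```cpp
#include <climits>
#include <cstdlib>
#include <cstring>
#include <sstream>
#include <string>

// Stand-in for the real decoder from the question (assumption: the real one
// has the same signature). It only exists so this sketch is self-contained.
bool read_private_format(const char* filename, std::ostringstream& output) {
    if (filename == nullptr) return false;
    output << "decoded:" << filename;
    return true;
}

extern "C" {

// Hypothetical export. On success, *out_data points to a malloc'ed UTF-8
// buffer holding *out_len bytes, followed by a NUL that is not counted in
// *out_len. On failure, *out_data is nullptr and *out_len is 0.
__attribute__((visibility("default")))
bool c_read_private_format_alloc(const char* filename, char** out_data, int* out_len) {
    if (out_data == nullptr || out_len == nullptr) return false;
    *out_data = nullptr;
    *out_len = 0;

    std::ostringstream oss;
    if (!read_private_format(filename, oss)) return false;

    const std::string str = oss.str();
    if (str.size() >= static_cast<std::size_t>(INT_MAX)) return false;

    // The callee owns the allocation until it hands the pointer to the caller.
    char* buf = static_cast<char*>(std::malloc(str.size() + 1));
    if (buf == nullptr) return false;
    std::memcpy(buf, str.c_str(), str.size() + 1);  // copies the trailing NUL too

    *out_data = buf;
    *out_len = static_cast<int>(str.size());
    return true;
}

// Hypothetical export. The caller must release the buffer through this
// function so that the same allocator that created it also frees it.
__attribute__((visibility("default")))
void c_free_private_format(char* data) {
    std::free(data);
}

}  // extern "C"
```

On the C# side, such a pair could presumably be declared with `out System.IntPtr` and `out int` parameters, the string built with `Marshal.PtrToStringUTF8(ptr, len)`, and the buffer released by calling the free export in a `finally` block. That costs one copy from native memory into the managed string, with no callback and no StringBuilder marshaling.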

malat