I am working on a C++ DLL with a C wrapper so that it can be used from different languages. For now, I am also developing a C# plugin that calls my DLL.

I want to pass a string (the path of a file) as an argument to my DLL so that I can use it inside the DLL.

C#

[DllImport(DllName, CallingConvention = DllCallingConvention)]
public static extern IntPtr AllocateHandle(string filename);

C wrapper

LPVOID SAMPLEDLL_API CALLCONV_API AllocateHandle(char* filename);

C++ class constructor

CustomData::CustomData(char* filename)
{
    _filename = filename; // string _filename;
}

When I write _filename to a file (because I haven't found a way to debug the DLL with breakpoints), I get something like ÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌÌ0à×. I tried different ways of converting a char* to a string, but the result is always the same.

Thank you in advance for your help.

Mathieu Gauquelin
  • Does https://stackoverflow.com/a/13993608/34092 help? – mjwills Mar 23 '18 at 10:48
  • I suppose you need to decorate your `filename` with `[MarshalAs(UnmanagedType.LPStr)]` – Evk Mar 23 '18 at 11:34
  • No to the first comment. Sorry, I am new to the world of DLLs, plugins, etc., so MarshalAs means nothing to me for now. I preferred to try the first answer before this, and it works, so thank you all the same – Mathieu Gauquelin Mar 23 '18 at 13:54

2 Answers

The problem is that strings in C# are Unicode, while a char* string in C++ is an ANSI string. You have to tell C# to marshal the string as ANSI:

[DllImport("DllName.dll", CallingConvention = CallingConvention.StdCall, CharSet = CharSet.Ansi)]
static extern IntPtr AllocateHandle(string filename);

You could also pass the string length as a second argument, so that the C++ side knows how long the string is.
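
If the length is passed as well, the native side can copy exactly that many bytes instead of relying on the terminating null. A minimal sketch, assuming a hypothetical helper MakeOwnedPath (not part of the original DLL) and assuming the caller passes the byte count of the ANSI-encoded string:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the original DLL: builds an owned
// std::string from a pointer plus an explicit byte count.
std::string MakeOwnedPath(const char* filename, int length)
{
    // Guard against a null pointer or a nonsensical length from the caller.
    if (filename == nullptr || length <= 0)
        return std::string();

    // Copies exactly `length` bytes; no null terminator is required.
    return std::string(filename, static_cast<std::size_t>(length));
}
```

On the C# side, the second argument would have to be the byte count of the encoded string, which can differ from filename.Length when the path contains non-ASCII characters in a multi-byte code page.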

[edit]

According to some comments, you could also try changing [char *] to [wchar_t *], which is Unicode. You should then of course use the appropriate attribute on the C# side: CharSet = CharSet.Unicode
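
For the wide-character variant, the native side could look roughly like the sketch below. The class name CustomDataW and the filename() accessor are hypothetical, introduced only for illustration. The sketch assumes a Windows build, where wchar_t is 16 bits wide and matches the UTF-16 data that CharSet.Unicode produces:

```cpp
#include <string>

// Hypothetical wide-character variant of the question's class,
// for illustration only.
class CustomDataW
{
public:
    // Copies the characters into an owned std::wstring, so the object does
    // not depend on the caller's buffer after the constructor returns.
    explicit CustomDataW(const wchar_t* filename)
        : _filename(filename != nullptr ? filename : L"")
    {
    }

    // Hypothetical accessor so callers can read the stored path.
    const std::wstring& filename() const { return _filename; }

private:
    std::wstring _filename;
};
```

The C wrapper would then take a const wchar_t* parameter instead of char*.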

Adam Jachocki
  • Isn't the default `Ansi`? https://msdn.microsoft.com/en-us/library/system.runtime.interopservices.dllimportattribute.charset(v=vs.110).aspx – mjwills Mar 23 '18 at 10:54
  • The default DllImport string marshaling is ANSI; I would advise using `wchar_t` in the DLL – Markiian Benovskyi Mar 23 '18 at 10:57
  • Ansi refers to variable-length characters, i.e. UTF-8 – babu646 Mar 23 '18 at 10:59
  • No, ANSI is a simple string. One byte = one character. – Adam Jachocki Mar 23 '18 at 11:01
  • @Bigiansen ANSI uses 1 byte per char and Unicode uses 2 bytes to represent chars in a string. If the OP wants to use non-English characters, they should be in Unicode – Markiian Benovskyi Mar 23 '18 at 11:01
  • From MSDN: Ansi = Marshal strings as multiple-byte character strings. https://msdn.microsoft.com/es-es/library/system.runtime.interopservices.charset(v=vs.110).aspx – babu646 Mar 23 '18 at 11:03
  • https://msdn.microsoft.com/en-us/library/4bb3e64h.aspx - this shows that only some languages support multibyte character strings. Such as Japanese or Chinese. Normally ANSI is one byte per character. Unicode charsets are wider. In unicode every character in every language is coded with multiple bytes. UTF8 can have from one up to 4 bytes per character. – Adam Jachocki Mar 23 '18 at 11:14
  • ANSI = There's no one fixed ANSI encoding - there are lots of them. Usually when people say "ANSI" they mean "the default locale/codepage for my system" which is obtained via Encoding.Default, and is often Windows-1252 but can be other locales. – babu646 Mar 23 '18 at 11:16
  • `CharSet.Ansi` tells the marshaller to marshal as ANSI unless otherwise instructed. Likewise `CharSet.Unicode` is an instruction to marshal as UTF-16 unless otherwise instructed. Since ANSI is a collection of 8-bit character sets, this is the problem why OP characters are not displayed correctly... refer here for similar question: https://stackoverflow.com/q/17808003/4697963 – Markiian Benovskyi Mar 23 '18 at 11:21
  • Ansi is 7-bit character set :) Extended Ansi is 8. – Adam Jachocki Mar 23 '18 at 11:26
  • Using `CharSet.Unicode` and `wchar_t` with [wchar_t * convert to string](https://stackoverflow.com/questions/27720553/conversion-of-wchar-t-to-string), it is working now, thank you ;) – Mathieu Gauquelin Mar 23 '18 at 13:53

It looks like you are storing the string that is passed from managed code in a member field in the unmanaged class. That won't work, because the Garbage Collector will move or dispose of the managed string at some point, which will render the string useless on the unmanaged side. If you want to keep the string for later use, you have to make a copy of it on the unmanaged side (allocated on the unmanaged heap).

CustomData::CustomData(char *filename)
{
  // _filename will need to be freed at some point; might
  // want to think about using std::string instead
  // like _filename = new std::string (filename);
  _filename = strdup(filename);
}

Now the unmanaged code has its own (unmanaged) copy of the string, so when the GC disposes of the managed string, it won't matter. This is the simplest way to deal with the situation, since you are also writing the unmanaged code. There are other measures to prevent the GC from interfering with unmanaged interop, but those are tricky and time-consuming, and are necessary only if you can't modify the unmanaged code.
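
As an alternative to strdup, the copy can be held in a std::string member, which releases its memory automatically. The question's comment suggests that _filename is already a string; the sketch below assumes it is std::string, and the filename() accessor is a hypothetical addition for illustration:

```cpp
#include <string>

// Sketch of the question's class, assuming _filename is a std::string.
class CustomData
{
public:
    // Constructing the std::string copies the characters, so the object
    // keeps its own buffer no matter what happens to the caller's memory.
    explicit CustomData(const char* filename)
        : _filename(filename != nullptr ? filename : "")
    {
    }

    // Hypothetical accessor so callers can read the stored path.
    const std::string& filename() const { return _filename; }

private:
    std::string _filename; // freed automatically with the object
};
```

With this layout no explicit free is needed, and later changes to the original buffer do not affect the stored path.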

Mark Benningfield
  • It's even worse than your first paragraph describes, because the buffer used for interop can be garbage collected long before the managed string it is associated with. – Ben Voigt Mar 25 '18 at 15:27