I'm having a problem returning a native string in the correct character set. I convert from a string to a wstring to an LPCWSTR to pass back to managed code. For the string-to-wide-string step, the s2ws method produces a very short result because it seems to stop at my first would-be terminator (in managed), which is ';'. So, before you mention s2ws, I've already tried it to no avail.

String stuff:

    char target[1024];
    sprintf_s(target, 1024, "%s %s%s%s",
            mac,
            " (",
            pWLanBssList->wlanBssEntries[t].dot11Ssid.ucSSID,
            ");");
    std::string targetString = std::string(target);
    targetWString.append(targetString.begin(), targetString.end());

Later string stuff:

    std::wstring returnWString = L"";
    returnWString.append(SomeMthod().c_str());
    //wprintf_s(returnWString.c_str()); // Works - Data is in the string.
    LPCWSTR returnLpcuwstr = returnWString.c_str();
    return returnLpcuwstr;

How do I know it's a character set/encoding problem? Well, when the LPCWSTR is returned to managed and I use Marshal to convert it to a Unicode string, I get a wall of null/empty characters. When I try it as ANSI, this is what I get (reduced in size/scale for readability):

ÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝ

The s2ws method is supposed to address the ANSI/Unicode nightmare that is std::string->std::wstring, but it makes the return far shorter than it should be and doesn't address the actual charset problem.

Result (to ANSI, again - no reducing done on my part): ÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝÝ

When I check in native, wprintf_s shows me that the string is valid/good before the LPCWSTR conversion happens in the export method; so, I need to understand:

  1. Is there a way for me to tell what the size of the characters actually is? (I'm thinking that this is an 8-bit versus 16-bit scenario?)
  2. Since wprintf_s works on the wide string, I checked it against the LPCWSTR and it printed the same (expected) data; so, the issue doesn't appear to be in the .ctor() of the LPCWSTR. Yet, I want to double-check my maths: Am I LPCWSTR'ing correctly?
  3. Since everything in native is telling me that the string is good, how can I check its character set (in native)?

The return, itself, is about 8 lines of text, with a delimiter ';' used so that I can split the string in managed and do magic with it. The only issue is getting the string to render as a valid string in managed, with the correct characters in it.

I feel like, maybe, I'm missing something obvious here but I cannot figure out what it is and just need a fresh pair of eyes to tell me where and how I'm failing at life.

renholder

1 Answer

    LPCWSTR returnLpcuwstr = returnWString.c_str();
    return returnLpcuwstr;

This is returning a pointer to data that gets freed immediately after the return, when returnWString goes out of scope. The returned pointer is invalid before the receiver can even use it. This is undefined behavior.

To do what you are attempting, you will have to return a pointer to dynamically allocated memory, and then the receiver will have to free that memory when done using it.

Assuming by "managed" you are referring to .NET, then .NET's marshaller frees unmanaged memory using CoTaskMemFree(), so if you are using default marshaling, the returned pointer must be pointing at memory that is allocated using CoTaskMemAlloc() or equivalent (SysAllocString...(), for instance).
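For the default-marshaling case, a minimal sketch of copying the string into a `CoTaskMemAlloc()` buffer could look like the following. `CopyForMarshaller` is a hypothetical helper name, not something from the question, and the non-Windows stand-ins exist only so the sketch compiles outside Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

#ifdef _WIN32
#include <objbase.h>   // CoTaskMemAlloc, CoTaskMemFree (link with ole32.lib)
#else
#include <cstdlib>
// Assumption: stand-ins so the sketch builds off Windows; illustration only.
inline void* CoTaskMemAlloc(std::size_t bytes) { return std::malloc(bytes); }
inline void  CoTaskMemFree(void* p) { std::free(p); }
#endif

// Hypothetical helper: copies the text into memory that the receiver can
// release with CoTaskMemFree(). Returns nullptr if the allocation fails.
wchar_t* CopyForMarshaller(const std::wstring& text)
{
    // A null-terminated return cannot carry embedded nulls; the receiver
    // would stop reading at the first one.
    assert(text.find(L'\0') == std::wstring::npos);

    const std::size_t bytes = (text.size() + 1) * sizeof(wchar_t); // +1 for the terminator
    wchar_t* buffer = static_cast<wchar_t*>(CoTaskMemAlloc(bytes));
    if (buffer != nullptr)
        std::memcpy(buffer, text.c_str(), bytes); // c_str() includes the terminator
    return buffer;
}
```

The exported function would then end with something like `return CopyForMarshaller(returnWString);` instead of returning `returnWString.c_str()`, so the data no longer depends on the lifetime of the local `std::wstring`.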

Otherwise, if you are not using default marshaling (i.e., you are calling Marshal.PtrToStringUni() manually instead), then you will have to make the .NET code pass the memory pointer back to your C++ code so it can then free the memory properly. Then your C++ code can allocate the memory however you want to (as long as it is still allocated dynamically so it can survive past the function return).
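For the manual-marshaling case, a rough sketch of the allocate/free pairing might be the following. The names `GetNetworkList`, `FreeNetworkList` and `BuildNetworkList` are hypothetical, the sample text is made up, and a real DLL would also need the exports declared (for example with `__declspec(dllexport)`).

```cpp
#include <cassert>
#include <cwchar>
#include <new>
#include <string>

// Hypothetical stand-in for the code that builds the ';'-delimited result.
// The content is made-up sample data.
static std::wstring BuildNetworkList()
{
    return L"00:11:22:33:44:55 (HomeNet);66:77:88:99:AA:BB (Office);";
}

// Returns a heap copy of the text. The caller must pass the pointer back to
// FreeNetworkList() once it has finished reading it.
extern "C" const wchar_t* GetNetworkList()
{
    const std::wstring text = BuildNetworkList();
    // nothrow so a failed allocation of the copy is reported as nullptr
    wchar_t* copy = new (std::nothrow) wchar_t[text.size() + 1];
    if (copy == nullptr)
        return nullptr;
    std::wmemcpy(copy, text.c_str(), text.size() + 1); // copies the terminator too
    return copy;
}

// Releases memory returned by GetNetworkList(); passing nullptr is harmless.
extern "C" void FreeNetworkList(const wchar_t* text)
{
    delete[] text;
}
```

On the .NET side, the returned `IntPtr` would be read with `Marshal.PtrToStringUni()` and then handed to `FreeNetworkList`, so the memory is released by the same module and allocator that created it.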

Remy Lebeau
  • I'm passing from C++ to C#/.NET. To clarify: The code in C++ is trying to return a string that's "delimited" by ";" so I can string.Split() in .NET. The problem is that when Marshal.PtrToStringUni() or Marshal.PtrToStringAnsi() are used, it's junk data. If the lifetime of LPCWSTR ends when it leaves the scope of the extern, how can I "force" it to still exist long enough for Marshal to read it in via the IntPtr? – renholder Nov 27 '18 at 07:07
  • Never mind. Your answer led me to what I had to do: I had to change the scope of the std::wstring to be at a higher level. I understand what the problem was now: LPCWSTR is just a pointer to the object passed into it. As you said, when scope is left, the object and pointers to the std::wstring are disposed of. So, the LPCWSTR is just returning data referenced by a pointer that no longer exists (and I suspect the data, as well). One thing that I don't understand, though, is why the LPCWSTR returns such a large wall of characters, as if it was receiving the correct pointer. – renholder Nov 27 '18 at 07:13
  • @renholder Indeed, once the `std::wstring` goes out of scope, its data is destroyed, and the `LPCWSTR` is left dangling, thus accessing the data is **undefined behavior**, anything can happen. You are seeing trash. You could just as easily crash the app instead. – Remy Lebeau Nov 27 '18 at 07:56
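The fix described in the comments above (moving the `std::wstring` to a wider scope) could be sketched roughly as follows. `g_lastResult`, `GetNetworkListShared` and `BuildSharedList` are hypothetical names, and the sample text is made up.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the code that builds the ';'-delimited result.
static std::wstring BuildSharedList()
{
    return L"00:11:22:33:44:55 (HomeNet);";
}

// File-scope storage: it outlives every call, so pointers into it stay
// valid after the export returns.
static std::wstring g_lastResult;

extern "C" const wchar_t* GetNetworkListShared()
{
    g_lastResult = BuildSharedList();
    // Valid until the next call replaces g_lastResult.
    return g_lastResult.c_str();
}
```

This avoids the dangling pointer, but the receiver must not let a default marshaller free the pointer, each call overwrites the previous result, and the shared buffer is not thread-safe as written.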