I'm a little confused about C strings and wide C strings. For the sake of this question, assume that I am using Microsoft Visual Studio 2010 Professional. Please let me know if any of my information is incorrect.

I have a struct with a const wchar_t* member which is used to store a name.

struct A
{
    const wchar_t* name;
};

When I assign object 'a' a name like so:

int main()
{
    A a;

    const wchar_t* w_name = L"Tom";
    a.name = w_name;

    return 0;
}

That is just copying the memory address that w_name points to into a.name. Now w_name and a.name are both wide character pointers which point to the same address in memory.

If I am correct, then I am wondering what to do about a situation like this. I am reading in a C string from an XML attribute using tinyxml2.

tinyxml2::XMLElement* pElement;
// ...
const char* name = pElement->Attribute("name");

After I have my C string, I am converting it to a wide character string as follows:

size_t newsize = strlen(name) + 1;
wchar_t * wcName = new wchar_t[newsize];
size_t convertedChars = 0;
mbstowcs_s(&convertedChars, wcName, newsize, name, _TRUNCATE);

a.name = wcName;

delete[] wcName;

If I am correct so far, then the line:

 a.name = wcName;

is just copying the memory address of the first character of array wcName into a.name. However, I am deleting wcName directly after assigning this pointer which would make it point to garbage.

How can I convert my C string into a wide character C string and then assign it to a.name?

user974967
  • Why are you deleting wcName? If the point is to convert to wchar_t, then why are you deleting the newly-allocated memory holding the result of your conversion? – atkretsch Nov 15 '12 at 20:04
  • The code as posted is problematic. There doesn't seem to be an ideal place to delete it. I thought about adding an ~A() destructor to delete it, but that seems really ugly to me for some reason. – user974967 Nov 15 '12 at 20:08
  • @user974967 That (the destructor) is *precisely* where you would delete it (or anytime you need to reassign it, but then I would advise trying to reuse the buffer if the new assignment still fits). – WhozCraig Nov 15 '12 at 20:09

3 Answers

The easiest approach is probably to task your name variable with the management of the memory. This, in turn, is easily done by declaring it as

std::wstring name;

std::wstring has no notion of separating constness of the content from constness of the object: you can't make the individual characters const, and making the entire object const would prevent it from being assigned to.
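
A minimal sketch of what that change could look like, assuming the attribute text is plain ASCII or otherwise matches the current C locale. It uses the standard std::mbstowcs instead of the mbstowcs_s from the question so that it stays portable, and the helper name widen is hypothetical. Because the member is a std::wstring, the assignment copies the characters, so the temporary buffer can be released right away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

struct A
{
    std::wstring name;   // owns its own copy of the characters
};

// Hypothetical helper: converts a narrow C string to a std::wstring
// using the current C locale. Returns an empty string on failure.
std::wstring widen(const char* s)
{
    assert(s != NULL);
    // A wide string never needs more elements than the narrow one has bytes,
    // so strlen(s) + 1 always leaves room for the terminator.
    std::vector<wchar_t> buf(std::strlen(s) + 1);
    std::size_t n = std::mbstowcs(&buf[0], s, buf.size());
    if (n == static_cast<std::size_t>(-1))
        return std::wstring();          // invalid multibyte sequence
    return std::wstring(&buf[0], n);    // copies n characters out of buf
}
```

With this in place, the assignment from the question would read a.name = widen(pElement->Attribute("name")); and nothing needs to be deleted by hand.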

Dietmar Kühl
  • But how is this going to get his conversion done? Granted it would save him from having to manage the out-of-scope deletion of his conversion buffer, which is currently deleted incorrectly after saving the address. Or did the std lib add a `char*` to `std::wstring` up-conversion and I missed it (wouldn't be the first time I was sleeping at the wheel). – WhozCraig Nov 15 '12 at 20:12
  • It seems he is already doing the conversion using `mbstowcs()` (well the "safe" abomination `mbstowcs_s()` which a certain vendor tries to impose). Whether this deals correctly with whatever encoding is in the XML file, I don't know, but it seems that the question wasn't about the conversion. – Dietmar Kühl Nov 15 '12 at 20:26
  • @WhozCraig - using a `wstring` will make `a.name = wcName;` work correctly, and the temp buffer can be deleted afterwards without problems. – Bo Persson Nov 15 '12 at 21:09
  • @BoPersson or use the std::wstring *as* the buffer itself. (see [answer](http://stackoverflow.com/questions/13405176/c-string-to-wide-c-string-assignment/13406143#13406143)) – WhozCraig Nov 15 '12 at 21:11

You can do this with a std::wstring without relying on the additional temporary conversion buffer allocation and destruction. This is not tremendously important unless you're overly concerned about heap fragmentation or are on a limited system (e.g., Windows Phone). It just takes a little setup on the front side. Let the standard library manage the memory for you (with a little nudge).

class A
{
   ...
   std::wstring a;
};


// Convert the string (I'm assuming it is UTF8) to wide char
int wlen = MultiByteToWideChar(CP_UTF8, 0, name, -1, NULL, 0);
if (wlen > 0)
{
    // reserve space. std::wstring gives us the terminator slot
    // for free, so don't include that. MB2WC above returns the
    // length *including* the terminator.
    a.resize(wlen-1);
    MultiByteToWideChar(CP_UTF8, 0, name, -1, &a[0], wlen);
}
else
{   // no conversion available/possible.
    a.clear();
}

On a complete side-note, you can build TinyXML to use the standard library and std::string rather than char *, which doesn't really help you much here, but may save you a ton of future strlen() calls later on.
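
For readers who are not on Windows, the same two-pass idea (ask for the length, then convert straight into the string's own storage) can be sketched with the standard std::mbsrtowcs. This is an illustration under assumptions, not a replacement for the code above: it converts according to the current C locale rather than UTF-8, and the helper name widen_in_place is hypothetical. It also sizes the string to include the terminator and trims it afterwards, so the conversion never writes past the string's size.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical helper: converts directly into the std::wstring's own
// storage, so no separate temporary buffer is allocated.
std::wstring widen_in_place(const char* s)
{
    assert(s != NULL);
    std::wstring out;

    // First pass: with a null destination, std::mbsrtowcs returns the number
    // of wide characters required, excluding the terminator.
    std::mbstate_t state = std::mbstate_t();
    const char* src = s;
    std::size_t n = std::mbsrtowcs(NULL, &src, 0, &state);
    if (n == static_cast<std::size_t>(-1) || n == 0)
        return out;                     // invalid input or empty string

    // Second pass: leave room for the terminator, convert, then trim it off.
    out.resize(n + 1);
    state = std::mbstate_t();
    src = s;
    std::mbsrtowcs(&out[0], &src, n + 1, &state);
    out.resize(n);
    return out;
}
```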

WhozCraig

As you correctly noted, a.name is just a pointer; it does not own any string storage. You must manage that storage manually, using new or a static or scoped array.
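
A rough sketch of what the manual route involves, assuming the struct itself owns the buffer and frees it in a destructor, as suggested in the comments on the question. The set_name member is hypothetical, and the conversion uses the standard std::mbstowcs with the current C locale. Copying is disabled in the pre-C++11 style so that two objects can never delete the same buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct A
{
    const wchar_t* name;   // owned: allocated with new[], freed in ~A()

    A() : name(NULL) {}
    ~A() { delete[] name; }

    // Hypothetical member: replaces the current name with a converted copy of s.
    void set_name(const char* s)
    {
        assert(s != NULL);
        // strlen(s) + 1 elements always suffice, including the terminator.
        std::size_t cap = std::strlen(s) + 1;
        wchar_t* buf = new wchar_t[cap];
        if (std::mbstowcs(buf, s, cap) == static_cast<std::size_t>(-1))
            buf[0] = L'\0';            // conversion failed: store an empty name
        delete[] name;                 // release the previous buffer, if any
        name = buf;
    }

private:
    // Declared but not defined: copying would lead to a double delete.
    A(const A&);
    A& operator=(const A&);
};
```

All of this bookkeeping is what a string class does for you.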

To get rid of this boilerplate, just use one of the available string classes: CStringW from ATL (easy to use but MS-specific) or std::wstring from the standard library (standard C++, but not as easy to convert to from char*):

#include <atlstr.h>

// Conversion ANSI -> Wide is automatic
const CStringW name(pElement->Attribute("name"));    

Unfortunately, using std::wstring with char* is not as easy. See the conversion function here: How to convert std::string to LPCWSTR in C++ (Unicode)

Rost