1

I've created a class to test some functionality I need to use. Essentially the class will take a deep copy of the passed in string and make it available via a getter. Am using Visual Studio 2012. Unicode is enabled in the project settings.

The problem is that the memcpy operation is yielding a truncated string. Output is like so;

THISISATEST: InstanceDataConstructor: Testing testing 123
Testing te_READY

where the first line is the check of the passed in TCHAR* string & the second line is the output from populating the allocated memory with the memcpy operation. Output expected is; "Testing testing 123".

Can anyone explain what is wrong here?

N.B. Got the #ifndef UNICODE typedefs from here: how-to-convert-tchar-array-to-stdstring

#ifndef INSTANCE_DATA_H//if not defined already
#define INSTANCE_DATA_H//then define it

#include <string>

//TCHAR is just a typedef, that depending on your compilation configuration, either defaults to char or wchar.
//Standard Template Library supports both ASCII (with std::string) and wide character sets (with std::wstring).
//All you need to do is to typedef String as either std::string or std::wstring depending on your compilation configuration.
//To maintain flexibility you can use the following code:
#ifndef UNICODE  
  typedef std::string String; 
#else
  typedef std::wstring String; 
#endif
//Now you may use String in your code and let the compiler handle the nasty parts. String will now have constructors that lets you convert TCHAR to std::string or std::wstring.


class InstanceData
{
public: 
    InstanceData(TCHAR* strIn) : strMessage(strIn)//constructor     
        {
        //Check to passed in string
        String outMsg(L"THISISATEST: InstanceDataConstructor: ");//L for wide character string literal
        outMsg += strMessage;//concatenate message
        const wchar_t* finalMsg = outMsg.c_str();//prepare for outputting
        OutputDebugStringW(finalMsg);//print the message    

        //Prepare TCHAR dynamic array.  Deep copy.
        charArrayPtr = new TCHAR[strMessage.size() +1];
        charArrayPtr[strMessage.size()] = 0;//null terminate
        std::memcpy(charArrayPtr, strMessage.data(), strMessage.size());//copy characters from array pointed to by the passed in TCHAR*.

        OutputDebugStringW(charArrayPtr);//print the copied message to check    
        }

    ~InstanceData()//destructor
        {
            delete[] charArrayPtr;
        }

//Getter
TCHAR* getMessage() const
{
    return charArrayPtr;
}

private:
    TCHAR* charArrayPtr;
    String strMessage;//is used to conveniently ascertain the length of the passed in underlying TCHAR array.
};
#endif//header guard
Community
  • 1
  • 1
GoFaster
  • 835
  • 1
  • 12
  • 23
  • use `String` instead of `TCHAR*` for `charArrayPtr` – M.M Nov 11 '14 at 10:19
  • 4
    your `memcpy` is too short, the third argument shoudl be the number of bytes but you are passing the number of characters. Use `std::copy` isntead. – M.M Nov 11 '14 at 10:20
  • 2
    Or maybe `std::memcpy(charArrayPtr, strMessage.data(), strMessage.size() * sizeof(TCHAR))` – Roger Rowland Nov 11 '14 at 10:20
  • Why not just `std::vector` or your `String` type, and leave the dynamic memory stuff alone? – PaulMcKenzie Nov 11 '14 at 10:31
  • @Roger Rowland That worked well; nice simple fix! Thanks. – GoFaster Nov 11 '14 at 10:33
  • @GoFaster - Try and make a copy of your `InstanceData` object `{InstanceData d1(_T"abc"); InstanceData d2 = d1;}` That 2 line program breaks your class. – PaulMcKenzie Nov 11 '14 at 10:36
  • @PaulMcKenzie the TCHAR* return type in the getter is necessary to pass back to a third party library. Hence the shenanigans with dynamic memory. – GoFaster Nov 11 '14 at 10:36
  • 1
    @GoFaster - There is a `c_str()` function that takes care of all of that. There is no need to do any of the dynamic memory allocation. A string class would be worth nothing if it didn't have a method of returning a C-style string. – PaulMcKenzie Nov 11 '14 at 10:37
  • @PaulMcKenzie Well said; glad someone's thinking straight. Is a tidier solution. Used: (TCHAR*)strMessage.c_str(). Thanks. – GoFaster Nov 11 '14 at 10:55

2 Answers2

2

A solution without all of the dynamically allocated memory.

#include <tchar.h>
#include <vector>
//...
class InstanceData
{
    public: 
        InstanceData(TCHAR* strIn) : strMessage(strIn),
        {   
            charArrayPtr.insert(charArrayPtr.begin(), strMessage.begin(), strMessage.end())
            charArrayPtr.push_back(0);   
        }

        TCHAR* getMessage()
        { return &charArrayPtr[0]; }

    private:
        String strMessage;
        std::vector<TCHAR> charArrayPtr;
};

This does what your class does, but the major difference being that it does not do any hand-rolled dynamic allocation code. The class is also safely copyable, unlike the code with the dynamic allocation (lacked a user-defined copy constructor and assignment operator).

The std::vector class has superseded having to do new[]/delete[] in almost all circumstances. The reason being that vector stores its data in contiguous memory, no different than calling new[].

PaulMcKenzie
  • 34,698
  • 4
  • 24
  • 45
1

Please pay attention to the following lines in your code:

// Prepare TCHAR dynamic array.  Deep copy.
charArrayPtr = new TCHAR[strMessage.size() + 1];
charArrayPtr[strMessage.size()] = 0; // null terminate

// Copy characters from array pointed to by the passed in TCHAR*.
std::memcpy(charArrayPtr, strMessage.data(), strMessage.size());

The third argument to pass to memcpy() is the count of bytes to copy.
If the string is a simple ASCII string stored in std::string, then the count of bytes is the same of the count of ASCII characters.

But, if the string is a wchar_t Unicode UTF-16 string, then each wchar_t occupies 2 bytes in Visual C++ (with GCC things are different, but this is a Windows Win32/C++ code compiled with VC++, so let's just focus on VC++).
So, you have to properly scale the size count for memcpy(), considering the proper size of a wchar_t, e.g.:

memcpy(charArrayPtr, strMessage.data(), strMessage.size() * sizeof(TCHAR));

So, if you compile in Unicode (UTF-16) mode, then TCHAR is expanded to wchar_t, and sizeof(wchar_t) is 2, so the content of your original string should be properly deep-copied.

As an alternative, for Unicode UTF-16 strings in VC++ you may use also wmemcpy(), which considers wchar_t as its "unit of copy". So, in this case, you don't have to scale the size factor by sizeof(wchar_t).


As a side note, in your constructor you have:

InstanceData(TCHAR* strIn) : strMessage(strIn)//constructor     

Since strIn is an input string parameter, consider passing it by const pointer, i.e.:

InstanceData(const TCHAR* strIn)
Mr.C64
  • 41,637
  • 14
  • 86
  • 162