
I am using the following code to convert a string from unsigned char* to const wchar_t*. The problem is that only the first part of the string is converted correctly; the rest comes out as garbage.

CODE

unsigned char* temp = fileUtils->getFileData("levels.json", "r", &size);
const char* temp1 = reinterpret_cast<const char*>(temp);
size_t len = mbstowcs(nullptr, &temp1[0], 0);
if (len == -1) {

} else {
    wchar_t* levelData = new wchar_t();
    mbstowcs(&levelData[0], &temp1[0], size*10);
}

OUTPUT

temp1 = "[{"scaleFactor": 1}][{"scaleFactor": 2}][{"scaleFactor": 3}][{"scaleFactor": 4}][{"scaleFactor": 5}][{"scaleFactor": 6}][{"scaleFactor": 7}][{"scaleFactor": 8}][{"scaleFactor": 9}][{"scaleFactor": 10}]"

levelData = "[{"scaleFactor": 1}][{"scaleFactor": 2}][{"scaleFactor": 3}][{"scaleFactor": 4}][{"scaleFactor": 5}][{"scaleFactor": 6}][{"scaleFactor": 7}][{"s慣敬慆瑣牯㨢㠠嵽筛猢慣敬慆瑣牯㨢㤠嵽筛猢慣敬慆瑣牯㨢ㄠ細ﵝ﷽꯽ꮫꮫꮫﺫﻮ"
asloob

4 Answers

wchar_t* levelData = new wchar_t();
mbstowcs(&levelData[0], &temp1[0], size*10);

That allocated enough memory for exactly ONE character. That's not enough to store your string, so of course things will not work right.

Also, where'd that 10 come from?
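
To make the difference concrete, here is a minimal sketch of sizing the storage by element count. The helper name `make_wide_buffer` is made up for illustration, and using `std::vector` instead of a raw `new[]` is my own choice, not something from the question.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the question): storage for `count` wide
// characters plus one extra element for the terminating L'\0'.
std::vector<wchar_t> make_wide_buffer(std::size_t count) {
    // Compare:
    //   new wchar_t()       -> ONE wchar_t, value-initialized to 0
    //   new wchar_t[count]  -> `count` wchar_t elements (uninitialized)
    // std::vector owns the array, so no delete[] is needed.
    return std::vector<wchar_t>(count + 1, L'\0');
}
```

The buffer can then be handed to a C function through `buffer.data()`, with `buffer.size()` as the capacity.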

Ben Voigt
  • I should probably have mentioned that I am a C++ novice. With `size*10` I was trying to set a large size so it holds all characters. – asloob May 17 '13 at 12:31
  • @Ben Voigt, it is a place for asking and learning, I believe, not a playground for our sarcasm. – Number47 May 17 '13 at 18:11

You don't need to hard code the buffer size if you're going to allocate it dynamically (with new).

wchar_t* levelData = new wchar_t[len+1];
mbstowcs(&levelData[0], &temp1[0], len);
Simon
  • Thanks for this, but I am still getting garbage values at the end. How do I get rid of that? Is it because of incorrect length? – asloob May 17 '13 at 12:47
  • `sizeof (levelData)` gives you the size of a pointer, which is definitely not right. -1 – Ben Voigt May 17 '13 at 16:00
  • Please learn more before answering the questions. `len` in this case will be always 0 or -1. – Number47 May 17 '13 at 18:01
  • @Number47 No, when using `nullptr` as the first parameter (dest pointer) you get the length of the converted string regardless of the last parameter (max characters to convert). So `len` does give the correct length. – Simon May 20 '13 at 08:09
  • @Simon, oh, I did not know about that behavior. Still, it is non-standard: "POSIX specifies a common extension: if dst is a null pointer, this function returns the number of wide characters that would be written to dst, if converted." Besides, both your solution and mine are probably wrong, because the `temp` array is probably not followed by `0`. – Number47 May 20 '13 at 10:37
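
For reference, the two problems raised in these comments (an unterminated source and a destination with no room for the terminator) can be avoided together. The following is only a sketch under assumptions: `to_wide` is a made-up helper name, the input is assumed to be in the encoding of the current C locale, and a single pass over a buffer of `size + 1` elements is used instead of the null-destination length query, since the conversion cannot produce more wide characters than the bytes it reads.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of any library.
// Converts `size` bytes of multibyte text to a wide string.
// Returns an empty string if the input contains an invalid sequence.
std::wstring to_wide(const unsigned char* data, std::size_t size) {
    // Copying into std::string guarantees a terminating '\0' behind the
    // bytes, which the raw file buffer may not have.
    const std::string bytes(reinterpret_cast<const char*>(data), size);

    // At most one wide character per input byte, plus the terminator.
    std::wstring wide(bytes.size() + 1, L'\0');

    // The third argument is the capacity of the destination in elements.
    const std::size_t written =
        std::mbstowcs(&wide[0], bytes.c_str(), wide.size());
    if (written == static_cast<std::size_t>(-1)) {
        return std::wstring();
    }

    wide.resize(written); // drop the unused tail
    return wide;
}
```

A call would look like `std::wstring levelData = to_wide(temp, size);`. For non-ASCII input the program would also need a suitable `setlocale` call beforehand.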

Thanks to @BenVoigt, I found the mistake and changed the code to this:

wchar_t levelData[200];
mbstowcs(&levelData[0], &temp1[0], size);
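
One thing to watch with a fixed array: the last argument of mbstowcs is the maximum number of wide characters written to the destination, so passing the file size could overrun the 200 elements once the file grows. A minimal sketch of passing the destination capacity instead (the helper name `convert_fixed` is hypothetical, and `src` is assumed to be null-terminated):

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: converts the null-terminated multibyte string `src`
// into a fixed buffer `dst` that has room for `capacity` wide characters.
// Returns false on invalid input or if the result had to be truncated.
bool convert_fixed(wchar_t* dst, std::size_t capacity, const char* src) {
    if (capacity == 0) return false;

    const std::size_t written = std::mbstowcs(dst, src, capacity);
    if (written == static_cast<std::size_t>(-1)) return false;

    if (written == capacity) {
        // The buffer was filled completely, so mbstowcs wrote no terminator.
        dst[capacity - 1] = L'\0';
        return false;
    }
    return true;
}
```

With the array above, the call would be `convert_fixed(levelData, 200, temp1)`.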
asloob
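unsigned char* temp = fileUtils->getFileData("levels.json", "r", &size);
const char* temp1 = reinterpret_cast<const char*>(temp);

// One extra element for the terminating L'\0'; the file data itself is not terminated.
wchar_t* levelData = new wchar_t[size + 1];
wchar_t* position = levelData;
int last_char_size = 0;

mbtowc(NULL, 0, 0); // reset the conversion state
for (; size > 0; position++)
{
    // Convert one multibyte character, reading at most `size` bytes.
    last_char_size = mbtowc(position, temp1, size);
    if (last_char_size <= 0) break; // 0: embedded null, -1: invalid sequence
    temp1 += last_char_size;
    size -= last_char_size;
}
*position = L'\0'; // terminate, so that wide string functions can be used on levelData

if (last_char_size == -1)
{
    std::cout << "Invalid encoding" << std::endl;
}

delete[] temp; // * probably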

The marked line (*) depends on whether fileUtils->getFileData allocates a memory block for temp that the fileUtils object does not manage itself, which is the most probable case. You should check the documentation, though.

size elements are always enough to hold the converted characters in the levelData array, because within [] you specify the number of elements of the array, not the number of bytes (aka chars). In this case it is the number of wide characters, which cannot be more than the number of chars read.

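Another thing you should be aware of: fileUtils->getFileData probably reads binary data, so the text in temp is not followed by a terminating 0. String functions that rely on a terminator, such as mbstowcs called on temp or wcstok called on an unterminated result, will read past the end of the data and shoot your foot off.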

One more thing, in case you are not familiar with this construction:

    function_on_arrays( target,  source,  size )

Remember that a C/C++ program does not know the sizes of target and source. You probably do not want the function to touch anything beyond them, and that is mainly what size is for: it is your manual way of saying how many elements the function may process, so that it stays within the arrays' data.
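
As an illustration of that pattern, here is a small sketch with a made-up function `bounded_copy` (it does not exist in any library). The caller states both sizes and the function never goes beyond the smaller one:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical example of the (target, source, size) pattern.
// Copies at most `target_capacity` bytes and returns how many were copied.
std::size_t bounded_copy(char* target, std::size_t target_capacity,
                         const char* source, std::size_t source_size) {
    // The function cannot find out the array sizes by itself,
    // so it relies entirely on the numbers the caller passes in.
    const std::size_t n = std::min(target_capacity, source_size);
    std::memcpy(target, source, n);
    return n;
}
```

Passing a size larger than the real array, as with `size*10` in the question, lets the function write beyond the allocated memory.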

Edit: The earlier solution was wrong, because it mistakenly treated the last parameter of mbstowcs as the number of characters in the source.

Number47