
I have an STL map container filled with pairs of VCL UnicodeString objects. I'm trying to dump it to a file with the code quoted below, but instead of my strings I'm getting a file full of hex addresses.

//---------------------------------------------------------------------------

#include <vcl.h>
#pragma hdrstop
#include <tchar.h>
#include <iostream>
#include <fstream>
#include <map>

//---------------------------------------------------------------------------
int WINAPI _tWinMain(HINSTANCE, HINSTANCE, LPTSTR, int)
{
      std::map<UnicodeString, UnicodeString> fm;
      fm[U"a"]=U"test";
      fm[U"b"]=U"test2";
      fm[U"c"]=U"test3";
      fm[U"z"]=U"last one";
      std::ofstream out("c:\\temp\\fm.txt");
      std::map<UnicodeString, UnicodeString>::const_iterator itr;
      for (itr = fm.begin(); itr != fm.end(); ++itr) {
          out << itr->first.c_str()<< ",\t\t"<< itr->second.c_str()<<std::endl;
      }

      out.close();

   return 0;
}

yields this:

1f3b624,                1f5137c
1f3b654,                1f513bc
1f3b66c,                1f513fc
1f3b684,                1f258dc

I've tried various ways of casting the c string but nothing seems to work.

marcp
  • What is the type of `UnicodeString.c_str()` – Collin Dauphinee May 09 '13 at 18:15
  • @dauphic: `wchar_t*` on Windows, `char16_t*` on other platforms. The OP's code really should be using the `L` prefix instead of the `U` prefix for `UnicodeString` literals, at least on Windows. `UnicodeString` does have a constructor that accepts `char32_t*` input, though, which the `U` prefix produces. – Remy Lebeau May 09 '13 at 21:54

3 Answers


As usual, the answer is quite straightforward; I was led to it by @Dauphic's comment. I was using a narrow stream. The solution is to use a wide stream, which I was surprised to discover exists!

The solution is to change the stream declaration to:

std::wofstream out("c:\\temp\\fm.txt");

and presto changeo it works.
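For readers without the VCL at hand, here is a minimal sketch of the same idea in standard C++. It assumes std::wstring as a stand-in for UnicodeString (both expose c_str()), and the names write_pair, dump and dump_to_string are hypothetical helpers, not part of any library.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for std::map<UnicodeString, UnicodeString>. std::wstring is an
// assumption so the sketch builds without the VCL.
using WideMap = std::map<std::wstring, std::wstring>;

// Writes one "key,\t\tvalue" line. On a wide stream, const wchar_t* is
// formatted as text rather than as a pointer value.
void write_pair(std::wostream& out, const wchar_t* key, const wchar_t* value) {
    assert(key != nullptr && value != nullptr);
    out << key << L",\t\t" << value << L'\n';
}

// Dumps the whole map to any wide stream (std::wofstream, std::wcout, ...).
void dump(const WideMap& m, std::wostream& out) {
    for (const auto& entry : m) {
        write_pair(out, entry.first.c_str(), entry.second.c_str());
    }
}

// Hypothetical convenience wrapper that collects the dump in memory.
std::wstring dump_to_string(const WideMap& m) {
    std::wostringstream out;
    dump(m, out);
    return out.str();
}
```

To write to a file, pass a std::wofstream to dump instead. Note that a wide file stream converts characters through its locale, so text outside plain ASCII may need a suitable locale imbued on the stream before it is written correctly.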

The solution is also found here

marcp

The problem is that you're trying to output a const char32_t* to a narrow stream, which only supports narrow strings (char*).

The closest match to operator<<(const char32_t*) is operator<<(const void*), which outputs the address given.
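The fallback to the pointer overload can be illustrated with a small sketch. It uses const wchar_t* rather than const char32_t*, since the comments below note that c_str() returns wchar_t* on Windows; the mechanism is the same for any non-char character pointer. The function names are hypothetical. The explicit cast to const void* spells out the conversion that pre-C++20 compilers apply implicitly; C++20 deletes these narrow-stream overloads, so there the unqualified version fails to compile instead of printing an address.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// What a narrow stream does with a wide C string before C++20: the pointer
// converts to const void*, so the address is formatted, not the text.
std::string through_narrow_stream(const wchar_t* text) {
    assert(text != nullptr);
    std::ostringstream out;
    out << static_cast<const void*>(text);  // explicit form of the implicit conversion
    return out.str();
}

// A wide stream has a real overload for const wchar_t* and prints the text.
std::wstring through_wide_stream(const wchar_t* text) {
    assert(text != nullptr);
    std::wostringstream out;
    out << text;
    return out.str();
}
```

The exact address format is implementation-defined, so the narrow result is only guaranteed to be something other than the original text.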

You'll need to create an overload of operator<<(basic_ostream&, const char32_t*) that converts the array into something that can be output to a narrow stream.

Note that you will have to jump through hoops if you want to output to a human readable text file; 4-byte character encodings are non-standard for Windows, and the native API doesn't provide any functionality for dealing with them.
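One way to do that conversion, sketched under the assumption that UTF-8 is an acceptable encoding for the output file, is a small hand-written UTF-32 to UTF-8 encoder. The names append_utf8, to_utf8 and write_utf8 are hypothetical helpers, and a named function is used here instead of an operator<< overload to stay clear of the standard's own overload set.

```cpp
#include <cassert>
#include <ostream>
#include <string>

// Appends one Unicode code point as UTF-8. Surrogates and values above
// U+10FFFF are replaced with U+FFFD (the replacement character).
void append_utf8(std::string& out, char32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        cp = 0xFFFD;
    }
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a null-terminated UTF-32 string to a UTF-8 encoded std::string.
std::string to_utf8(const char32_t* text) {
    assert(text != nullptr);
    std::string out;
    for (; *text != U'\0'; ++text) {
        append_utf8(out, *text);
    }
    return out;
}

// Writes the converted text to a narrow stream such as std::ofstream.
void write_utf8(std::ostream& out, const char32_t* text) {
    out << to_utf8(text);
}
```

With this in place, a call such as write_utf8(out, U"test") sends readable bytes to a narrow stream instead of an address.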

Collin Dauphinee
  • I don't think it's `const unsigned*`... isn't it `const char32_t*`? – Mooing Duck May 09 '13 at 18:39
  • He's creating the strings with `U"foo"` – Collin Dauphinee May 09 '13 at 18:41
  • `§ 2.14.5/10 A string literal that begins with U, such as U"asdf", is a char32_t string literal. A char32_t string literal has type “array of n const char32_t”, where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters.` – Mooing Duck May 09 '13 at 18:42
  • Oh, yes. I thought he was compiling with Visual Studio for some reason. You're right, it's `char32_t` – Collin Dauphinee May 09 '13 at 18:43
  • Oh, yeah, in Visual Studio `char32_t` is a `typedef` for `unsigned`. But you should still use `char32_t` for this case. – Mooing Duck May 09 '13 at 18:44
  • Visual Studio doesn't support the `U` prefix, unless I'm out of date, so I was under the impression it was doing some non-standard magic with the `U`. – Collin Dauphinee May 09 '13 at 18:48
  • The C++Builder documentation states that the .c_str() function returns a wchar_t*. Generally speaking I can usually call c_str() on any VCL String (which resolves to UnicodeString) to pass it to everything non-VCL. – marcp May 09 '13 at 20:01
  • @marcp If `wchar_t` is 4 bytes for C++Builder, then just change your `ofstream` to a `wofstream`, which accepts `wchar_t` instead of `char` – Collin Dauphinee May 09 '13 at 21:09
  • Right, @dauphic, that's what I've posted above. Thanks – marcp May 09 '13 at 21:15
  • @dauphic: The VCL `UnicodeString` type operates on UTF-16 strings exclusively. It uses `wchar_t` on platforms where `wchar_t` is 2 bytes, like Windows, and it uses `char16_t` on other platforms. `UnicodeString` has a constructor that accepts UTF-32 data as input via `char32_t` (which the `U` prefix uses), and will convert that data to UTF-16. – Remy Lebeau May 09 '13 at 21:47

Use

F << AnsiString(S).c_str() << endl;

where F is an ofstream and S is a UnicodeString.

Lotfi