std::wstring arabic=L"الحانة";  
std::wstring english=L"english language";   
logger->log(NORMAL,L"abcdefgh");
logger->log(NORMAL,&arabic[0]);
logger->log(NORMAL,&english[0]);

I'm getting "abcdefgh,?????,english language" in my log file, but I expected the Arabic characters to be printed. I'm using Visual Studio 2005. Help please.

Blacktempel
mani
  • Nobody knows what that logger does. – Alexander V Dec 02 '15 at 07:07
  • You should show the code where the logger - whatever that may be - writes to the file. Maybe you write to an ANSI encoded file ? – Blacktempel Dec 02 '15 at 07:21
  • It's not formally guaranteed that the string buffer is zero-terminated. In practice I think that's pretty much implied by the C++11 guarantee that, also for a non-`const` string, `s[n]` where `n` is the length will be 0. But it would be much clearer, and would remove that doubt, to pass the `wstring` to the logger, and not just a pointer to its start. – Cheers and hth. - Alf Dec 02 '15 at 07:22
  • Keep in mind that the default action of wide streams such as `wcout` is to convert between internal wide strings and external narrow strings. – Cheers and hth. - Alf Dec 02 '15 at 07:24
  • logger->log just prints the wstring into the log file. I have added some more code to show it. The logger->log function is too big, that's why I didn't add it. Still, if you want it, I'll post it as well – mani Dec 02 '15 at 07:53
  • http://stackoverflow.com/help/mcve – n. m. could be an AI Dec 02 '15 at 08:00
  • @Blacktempel yes. The file I write was ANSI encoded. Then I tried changing it to UTF-8 and still got the same thing. And when I changed it to Unicode, there was a complete mess of Chinese characters. – mani Dec 02 '15 at 08:05
  • Without your attempts at the log function, in which you do the actual writing to the file, we can only take wild guesses at which of the x possible ways you might have tried. Post your attempts with code. The code in your question (as of now) **shows absolutely nothing.** As mentioned by @n.m., post an [MCVE](http://stackoverflow.com/help/mcve). – Blacktempel Dec 02 '15 at 08:19
  • We need to see how `logger::log` does the actual writing. Is it `operator <<`, a binary write, or ...? And is what is written a wstring or a wchar*? – Serge Ballesta Dec 02 '15 at 09:28

2 Answers


You need to do two things.

  1. Make sure your source file is UTF-8 with BOM.
  2. Call either `_setmode(filedescriptor, _O_U16TEXT);` or `_setmode(filedescriptor, _O_U8TEXT);` before doing any output.

The choice of mode depends on whether you want UTF-8 or UTF-16 output. Most of the time you want UTF-8 if you are writing to a disk file, and UTF-16 if you are writing to the console. Why, isn't this system beautiful?

To obtain the file descriptor for `wfstream yourstream`, use `yourstream.fd()`. To obtain the file descriptor for `stdout`, use `_fileno(stdout)`.

The console may or may not support Arabic. See here for more info. You should always be able to write to a file though.

You need to include additional headers:

#include <io.h>
#include <fcntl.h>

Note, this is specific to the Microsoft compiler.
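As a rough illustration of step 2, here is a minimal sketch that writes a wide string to a disk file as UTF-8. It assumes the file is opened through C stdio (so the descriptor comes from `_fileno`) rather than through the asker's logger, whose code is not shown. The helper name `append_wide_line_utf8` is hypothetical. On compilers other than Microsoft's the function is only a stub that reports failure.

```cpp
#include <cstdio>
#include <string>

#ifdef _MSC_VER
#include <fcntl.h>  // _O_U8TEXT
#include <io.h>     // _setmode
#endif

// Hypothetical helper (not the asker's logger): appends one wide-character
// line to `path` as UTF-8. Returns false if the file cannot be opened or
// written, or if the compiler is not Microsoft's, because the _setmode
// translation modes are MSVC-specific.
bool append_wide_line_utf8(const char* path, const std::wstring& line)
{
#ifdef _MSC_VER
    FILE* file = std::fopen(path, "a");
    if (!file) return false;

    // Switch the descriptor to UTF-8 translation before any output.
    // After this call only wide-character functions should be used on it.
    if (_setmode(_fileno(file), _O_U8TEXT) == -1) {
        std::fclose(file);
        return false;
    }

    std::fputws(line.c_str(), file);
    std::fputws(L"\n", file);
    const bool ok = !std::ferror(file);  // check the stream's error flag
    return std::fclose(file) == 0 && ok;
#else
    (void)path;
    (void)line;
    return false;
#endif
}
```

With the strings from the question, a call might look like `append_wide_line_utf8("app.log", arabic);`, where `app.log` is a placeholder file name. Passing the `std::wstring` itself also avoids relying on `&arabic[0]` being zero-terminated.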

Edit: added the discussion of different modes.

n. m. could be an AI
  • Rather than resort to `_setmode()`, you can instead `imbue()` a UTF-8 `locale` onto the `wfstream`. – Remy Lebeau Dec 02 '15 at 19:50
  • There are plenty of online examples/tutorials that show how to use the [`imbue()`](http://en.cppreference.com/w/cpp/io/basic_ios/imbue) method. – Remy Lebeau Dec 02 '15 at 22:35
  • @RemyLebeau Have you tried any of those with any Microsoft compiler on any Windows OS? Please share. How many of your Windows machines have en-US.UTF-8 installed? – n. m. could be an AI Dec 03 '15 at 03:17

There may be different problems here.

When writing a file in ANSI mode, the library tries to convert the Unicode string to narrow characters. If that is not possible, the stream enters an error state and nothing more is written to it until you clear the error condition.

Since your file contains ????? and the later "english language" text was still written, I assume that this is not your current problem, unless logger.log does not use operator <<.

But there can be another problem: even if the file was written correctly, the editor used to display it may have trouble with non-ASCII characters. To be sure, examine a hex dump of the file. If the characters displayed as ? are stored as the byte 0x3F, the problem was indeed at write time. But if they are stored as bytes greater than 127, it is just a display problem.

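vim is a multi-platform editor that can convert a file to hexadecimal so you can inspect the byte dump (for example with `:%!xxd`, assuming the xxd tool that normally ships with vim is available).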
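If no hex editor is at hand, the same check can be approximated in code. The sketch below is only an illustration: the names `ByteReport`, `inspect_bytes` and `read_raw` are made up for this example, and it assumes the log is an ANSI or UTF-8 file. In a UTF-16 file Arabic letters are stored as byte pairs that are both below 128, so this heuristic would not apply there.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical result type for the check described above.
struct ByteReport {
    std::size_t question_marks;  // bytes equal to 0x3F ('?')
    std::size_t high_bytes;      // bytes greater than 127 (non-ASCII)
};

// Counts literal '?' bytes and non-ASCII bytes in raw file contents.
ByteReport inspect_bytes(const std::string& bytes)
{
    ByteReport report = {0, 0};
    for (std::string::size_type i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (b == 0x3F) {
            ++report.question_marks;
        } else if (b > 127) {
            ++report.high_bytes;
        }
    }
    return report;
}

// Reads a whole file in binary mode so that no newline or encoding
// translation alters the bytes being inspected. Returns an empty string
// if the file cannot be opened.
std::string read_raw(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling `inspect_bytes(read_raw("app.log"))`, with `app.log` standing in for the real log path, gives two counts. Many 0x3F bytes and no high bytes point to data lost at write time, while high bytes suggest the text is in the file and only the viewer is misreading it.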

Serge Ballesta