
I'm trying to do something as simple as this:

#include <iostream>
#include <string>
using namespace std;

int main()
{
    wstring nihongo = L"みんなのにほんご";
    wcout << nihongo << endl;
    return 0;
}

But I get the following errors:

C:\Users\Leonne\Leomedia\MetaDatterTest.cpp|7|error: stray '\201' in program|

C:\Users\Leonne\Leomedia\MetaDatterTest.cpp|7|error: stray '@' in program|

C:\Users\Leonne\Leomedia\MetaDatterTest.cpp||In function 'int main()':|

C:\Users\Leonne\Leomedia\MetaDatterTest.cpp|7|error: converting to execution character set: Illegal byte sequence|

||=== Build finished: 3 errors, 0 warnings ===|

I'm on a Windows machine, and I am trying to write a library that is as portable as possible. It must be able to handle any kind of characters: Russian, Japanese, ASCII, everything.

André Caron
Andy Ibanez
  • The compiler has to read source files as Unicode too; I don't know if that's the problem, though. – Seth Carnegie Dec 17 '11 at 19:14
  • Even when you get this to compile, it's unlikely to fit your requirements of "as portable as possible". I assume that you want users to be able to actually see the text. And for that I recommend you get away from the console and use some kind of portable graphics library. – Benjamin Lindley Dec 17 '11 at 19:26
  • There's an invisible glyph in your statement before the L, with character code '\x3000'. Copy/paste this to fix: `wstring nihongo = L"みんなのにほんご";` – Hans Passant Dec 17 '11 at 19:31
  • @Hans: That's an answer. – Benjamin Lindley Dec 17 '11 at 19:35
  • From your comment it seems you should read http://www.joelonsoftware.com/articles/Unicode.html –  Dec 17 '11 at 19:37
  • Meh, getting his compiler to understand the literal and getting this to render properly in a console or terminal window on a Western machine takes another couple of answers. We don't know the compiler nor the operating system. – Hans Passant Dec 17 '11 at 19:38
  • @delnan: Nice article! I am starting to understand this a lot more now. Still reading it, but it's pretty informative so far. Hans: That fixed all the errors except the last one. I believe I am on a good path though. – Andy Ibanez Dec 17 '11 at 21:56
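
One way to sidestep the source-encoding problem raised in the comments above is to keep the source file pure ASCII and spell the characters with universal character names (`\uXXXX`). The sketch below does this for the string from the question; the function name `nihongo_text` is made up for illustration and is not from the thread. As an aside, octal `\201` (0x81) followed by `@` (0x40) is how Shift-JIS encodes U+3000, so the file was probably saved as Shift-JIS. That is an inference from the error messages, not something anyone in the thread states.

```cpp
#include <string>

// Hypothetical helper (name chosen for illustration only).
// Returns the text from the question ("minna no nihongo" in hiragana),
// written with universal character names so that the source file contains
// only ASCII and does not depend on the editor's or compiler's encoding.
std::wstring nihongo_text()
{
    //        mi     n      na     no     ni     ho     n      go
    return L"\u307f\u3093\u306a\u306e\u306b\u307b\u3093\u3054";
}
```

This only addresses the compile errors. Whether `wcout << nihongo_text()` renders correctly still depends on the console's locale and font.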

2 Answers


Visual Studio supports Unicode source files. Make sure that your .cpp files are saved as UTF-16 or UTF-8 with a BOM. Once they are in that format, your files will compile fine.

outis
Shane Powell
  • With a BOM? I'm using Code::Blocks for this. But anyways, what's a "BOM"? – Andy Ibanez Dec 17 '11 at 19:25
  • @Sergio, that's a short identifier at the start of Unicode files (2 bytes for UTF-16, 3 bytes for UTF-8). Open your source file in Notepad and do a 'Save As' with the encoding you want. – Ulterior Dec 17 '11 at 19:32
  • wcout is kind of braindead... Basically it narrows your wstring and then prints it; by default, narrowing the characters just chops off the high byte. Check out this blog entry on printing UTF-16 in Windows: http://blogs.msdn.com/b/michkap/archive/2008/03/18/8306597.aspx – Shane Powell Dec 18 '11 at 18:00
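
One alternative to relying on how `wcout` narrows characters is to convert the text to UTF-8 yourself and write the resulting bytes through a narrow stream such as `std::cout`. The sketch below is not taken from the linked blog entry; it is a plain hand-written encoder, it assumes C++11 (`char32_t`, `std::u32string`), and the name `to_utf8` is made up for illustration. Whether the bytes display correctly still depends on the terminal being set to UTF-8.

```cpp
#include <string>

// Hypothetical helper: encodes a sequence of Unicode code points as UTF-8.
// Surrogate values and code points above U+10FFFF are not validated here.
std::string to_utf8(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For example, `std::cout << to_utf8(U"\u307f\u3093\u306a")` writes nine bytes, three per hiragana character, with no dependence on the width of `wchar_t`.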

Check the first answer on this question:

std::wstring VS std::string

and my answer on this:

Handling UTF-8 in C++

I believe you will find an answer to your question there. Character encoding problems are confusing, and there is no simple answer...

Community
vitakot
  • It would be more appropriate to post this "answer" as a comment. Answers that simply [link elsewhere](http://meta.stackexchange.com/questions/8231/) are discouraged. – outis Dec 17 '11 at 19:53
  • OK, I didn't know that. One can see it is quite a common practice on SO. OT: I often wonder why people ask such questions on SO when there are so many answers and even articles everywhere on the Internet... – vitakot Dec 17 '11 at 20:20
  • People are lazy, and not always in a [good way](http://c2.com/cgi/wiki?LazinessImpatienceHubris). Spread the word about posting practices on SO. – outis Dec 17 '11 at 21:23
  • Answer or comment, this helped me understand this a bit better. Thanks for that. – Andy Ibanez Dec 17 '11 at 22:03
  • @outis I read elsewhere in SO comments that SO is meant to help Google find answers on SO rather than asking people to "Just google it". Reading your comment now, I'm confused as to what SO is there for. Just kidding though ;-) – GuruM Oct 18 '12 at 08:18