Consider:

#include <iostream>
#include <string>
#include <cstdlib>

using namespace std;

int main()
{
    wstring str = L"こんにちは";
    wcout << str << endl;
    system("pause");
}

I am trying to print the Japanese word for "hello" (こんにちは) from a C++ program. I saved the source file in Notepad using the "Unicode" encoding and compiled it with MinGW 4.7.2 as follows, but compilation fails:

cd "E:\GCC test"
g++ -c unicode.cpp

Output:

unicode.cpp:1:1: error: stray '\377' in program
unicode.cpp:1:1: error: stray '\376' in program
unicode.cpp:1:1: error: stray '#' in program
unicode.cpp:3:4: error: invalid preprocessing directive #i
unicode.cpp:5:4: error: invalid preprocessing directive #i
unicode.cpp:1:5: error: 'i' does not name a type
unicode.cpp:11:2: error: 'i' does not name a type
Peter Mortensen
Wasim
  • 4
    **1** ‘*an error*’, eh? *What* error?! **2** Unicode is not an encoding. – Biffen Mar 21 '16 at 08:03
  • E:\GCC test>g++ -c unicode.cpp unicode.cpp:1:1: error: stray '\377' in program unicode.cpp:1:1: error: stray '\376' in program unicode.cpp:1:1: error: stray '#' in program unicode.cpp:3:4: error: invalid preprocessing directive #i unicode.cpp:5:4: error: invalid preprocessing directive #i unicode.cpp:1:5: error: 'i' does not name a type unicode.cpp:11:2: error: 'i' does not name a type – Wasim Mar 21 '16 at 08:12
  • 1
    @Biffen: If Unicode is not an encoding, then what are the options Notepad on Windows offers under "Encoding" when saving a text file? It lists ANSI, Unicode, Unicode big endian and UTF-8. – Wasim Mar 21 '16 at 08:24
  • Have you tried UTF-8 (the best encoding there is (if you ask me))? – Biffen Mar 21 '16 at 08:27
  • OK, I tried with UTF-8. It now compiles and links, but nothing is printed to the console. – Wasim Mar 21 '16 at 09:00
  • I'd say that warrants a new question. – Biffen Mar 21 '16 at 09:01
  • You can't portably embed unicode in your program source. You have to use universal character constants. – M.M Mar 21 '16 at 09:15
  • 1
    @M.M `u8""` and friends are supposed to solve that, but I've come across at least one compiler that didn't conform. – Biffen Mar 21 '16 at 09:24
  • Related (duplicates?): *[How can I make a char string from a C macro's value?](https://stackoverflow.com/questions/195975)*, *[Macros to create strings in C](https://stackoverflow.com/questions/798221/)*, and *[Convert a preprocessor token to a string](https://stackoverflow.com/questions/240353/)*. See also "Linked" for those, e.g. [this](https://stackoverflow.com/questions/linked/195975) and [this](https://stackoverflow.com/questions/linked/240353). – Peter Mortensen May 21 '23 at 09:49

1 Answer

1

By the errors it looks like you've got a file in UTF-16LE with a BOM (Byte Order Mark), and that the compiler doesn't like that.

\377\376 (octal) = the bytes 0xFF 0xFE = the BOM (U+FEFF) encoded as UTF-16LE

Try removing the BOM, and/or try a different encoding. UTF-8 is an excellent encoding that doesn't need a BOM, and that most compilers and a lot of other tools will understand.


As for Unicode, it is not a binary character encoding. There are a few encodings that are ‘tied to’ Unicode, however. UTF-8 and UTF-16 are probably the most common such encodings.

If an editor offers to save a file in ‘Unicode encoding’, then try to stay away from that editor. If that editor is Notepad, then there are more reasons to stay away from it. Get yourself a proper editor for programming, one that understands encodings and EOLs, and that has syntax highlighting, etc.

Biffen
  • I have tried with UTF-8. It compiled and produced an executable, but nothing is printed to the console. – Wasim Mar 21 '16 at 09:08
  • 1
    Once again: that seems like an entirely different issue, so post a new question. – Biffen Mar 21 '16 at 09:10
  • @Vickey Nothing or do you see unintelligible characters getting printed? If it's the latter, then your terminal's font doesn't have glyphs for Japanese characters. Try pasting the Japanese "hello" directly in the prompt e.g. `> echo 'こんにちは'` and see if it prints fine. – legends2k Mar 21 '16 at 09:11
  • 1
    You have to jump through hoops to get Windows console applications to display Unicode. You have to set the code page in the console, select a monospace font with the right glyphs (by editing the registry), and do the right thing in the program source. (For GUI programs it is easy, though; the Windows API supports Unicode.) – M.M Mar 21 '16 at 09:17
  • @legends2k: My console does not recognize the Japanese characters こんにちは when I run echo こんにちは. What should I do? – Wasim Mar 21 '16 at 09:21
  • @M.M I'm missing the links right now (searching...), but all that's not enough, because it leads to nasty bugs with functions that return the amount of data written/read. Bug-free is just impossible. Edit: e.g. https://connect.microsoft.com/VisualStudio/feedback/details/543801/unicode-issues-with-writefile-and-in-the-crt . Like the status says, they won't solve it (somewhere else I read that it's because it's too much work), and WriteFile is used to build CRT functions too... – deviantfan Mar 21 '16 at 09:21