93

Example:

#include <iostream>

using namespace std;

int main()
{
    wchar_t en[] = L"Hello";
    wchar_t ru[] = L"Привет"; //Russian language
    cout << ru
         << endl
         << en;
    return 0;
}

This code prints only hex values that look like addresses. How do I print a wchar_t string?

zed91
  • The very first Related question is http://stackoverflow.com/questions/1625531/c-wchar-to-stdcout-and-comparision –  Mar 22 '10 at 16:12
  • 1
    On what OS, and using what console app? Some consoles don't support Unicode. – nobody Mar 22 '10 at 16:22
  • Thank you. I was writing a VC++ console app that printed back the command arguments and the output made me cringe. – James Ko Jul 18 '15 at 04:13

8 Answers

111

Edit: This doesn’t work if you are trying to write text that cannot be represented in your default locale. :-(

Use std::wcout instead of std::cout.

wcout << ru << endl << en;
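
As the edit note says, wcout fails for text the active locale cannot represent. Below is a minimal sketch, assuming a platform where adopting the environment's locale is sufficient (typically Linux with a UTF-8 locale; Windows consoles generally need more). The names use_environment_locale and print_wide_line are hypothetical helpers, not part of the answer.

```cpp
#include <clocale>
#include <iostream>

// Hypothetical helper: adopt the locale named by the environment (LANG, LC_ALL, ...).
// Call it before the first output. Returns false if that locale is unavailable.
bool use_environment_locale() {
    return std::setlocale(LC_ALL, "") != nullptr;
}

// Hypothetical helper: write one wide line and report whether the stream is still usable.
// On common implementations the stream enters a failed state when a character
// cannot be converted to the locale's encoding, so the return value is worth checking.
bool print_wide_line(std::wostream& out, const wchar_t* text) {
    out << text << L'\n';
    return out.good();
}
```

A caller would run use_environment_locale() once at startup and then call print_wide_line(std::wcout, ru); a false result means the text was not fully written.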
Nate
  • 4
    It prints only english string. What about russian? – zed91 Mar 22 '10 at 16:14
  • 11
    The console is not going to be Unicode enabled. Output redirection is the hangup, that's stuck in 8-bit char legacy. It can only output correct text on a Russian machine with the correct console font loaded. – Hans Passant Mar 22 '10 at 17:19
  • 1
    Note that if you try this to print to a Linux console, you are likely to end up with garbled characters, as most Linux systems do not use the UTF-16 encoding. – Björn Lindqvist Feb 11 '14 at 17:46
  • 1
    So what is `wcout` for when it is not working with UNICODE chars? – hfrmobile Apr 17 '23 at 14:24
20

Can I suggest std::wcout ?

So, something like this:

std::cout << "ASCII and ANSI" << std::endl;
std::wcout << L"INSERT MULTIBYTE WCHAR* HERE" << std::endl;

You might find more information in a related question here.

Community
Konrad
7

You cannot portably print wide strings using standard C++ facilities.

Instead you can use the open-source {fmt} library to portably print Unicode text. For example (https://godbolt.org/z/nccb6j):

#include <fmt/core.h>

int main() {
  const char en[] = "Hello";
  const char ru[] = "Привет";
  fmt::print("{}\n{}\n", ru, en);
}

prints

Привет
Hello

This requires compiling with the /utf-8 compiler option in MSVC.

For comparison, writing to wcout on Linux:

wchar_t en[] = L"Hello";
wchar_t ru[] = L"Привет";
std::wcout << ru << std::endl << en;

may transliterate the Russian text into Latin (https://godbolt.org/z/za5zP8):

Privet
Hello

This particular issue can be fixed by switching to a locale that uses UTF-8, but a similar problem exists on Windows that cannot be fixed with standard facilities alone.

Disclaimer: I'm the author of {fmt}.

vitaut
2
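#include <clocale>
#include <iostream>
using namespace std;
int main()
{
    // Select the Russian locale; the name "Russian" is specific to the Microsoft C runtime.
    setlocale(LC_ALL, "Russian");
    // The narrow literal must be stored in the encoding that this locale expects.
    cout << "\tДОБРО ПОЖАЛОВАТЬ В КИНО!\n";
}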
  • 2
    You could improve this answer by supplying an explanation to go with your code. – James Elderfield Aug 11 '16 at 12:09
  • 1
    Welcome to Stack Overflow! Although this code may help to solve the problem, it doesn't explain _why_ and/or _how_ it answers the question. Providing this additional context would significantly improve its long-term value. Please [edit] your answer to add explanation, including what limitations and assumptions apply. – Toby Speight Aug 11 '16 at 16:27
1

The Windows documentation on this topic is confusing. It helps to learn the C/C++ concepts on Unix/Linux before programming on Windows.

On Windows, wchar_t holds 16-bit UTF-16 code units (on Linux it is usually 32 bits wide). wprintf() and wcout often fail to print non-English wide characters correctly because the console does not accept UTF-16 output: Windows outputs in the current code page, while Unix/Linux typically outputs in UTF-8, and both are multibyte encodings. So you have to convert wide characters to multibyte before printing. The standard C function wcstombs() converts only to the encoding of the current C locale; on Windows, use WideCharToMultiByte() instead, which lets you name the target code page explicitly.

First, save the source file as UTF-8 using Notepad or another editor. Then install a console font that can display your language, and switch the console code page to UTF-8 by typing "chcp 65001" at the command prompt (Cygwin already defaults to UTF-8). Here is what I did for Thai.

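#include <windows.h>
#include <stdio.h>

int main()
{
    const wchar_t* in = L"ทดสอบ"; // Thai text
    char out[16];                 // code page 874 uses one byte per character
    // -1 tells the API that the input is null-terminated; the terminator is converted too.
    int n = WideCharToMultiByte(874, 0, in, -1, out, sizeof out, NULL, NULL);
    if (n == 0)
        return 1;                 // conversion failed; see GetLastError()
    printf("%s\n", out);          // displays correctly only if the console uses code page 874
    return 0;
}

Note that 874 is the Thai code page in the operating system, -1 means the input string is null-terminated, and sizeof out is the size of the output buffer in bytes.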

My suggestion is to avoid printing non-English wide characters to the console unless necessary, because it is not easy.
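
For comparison, the "convert wide characters to multibyte before printing" step can be sketched with the standard std::wcstombs, which converts to the encoding of the current C locale rather than to a code page chosen by the caller. This is a minimal sketch, not the answer's method; to_multibyte is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: convert a null-terminated wide string to the multibyte
// encoding of the current C locale. Returns false if a character cannot be
// represented in that encoding.
bool to_multibyte(const wchar_t* wide, std::string& out) {
    // Worst case: every wide character needs MB_CUR_MAX bytes, plus the terminator.
    std::vector<char> buf(std::wcslen(wide) * MB_CUR_MAX + 1);
    std::size_t written = std::wcstombs(buf.data(), wide, buf.size());
    if (written == static_cast<std::size_t>(-1))
        return false; // unrepresentable character in this locale
    out.assign(buf.data(), written);
    return true;
}
```

In the default "C" locale only basic characters convert; calling std::setlocale(LC_ALL, "") first makes the function use the user's locale instead.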

Ray Chakrit
0

The way to do it is to convert UTF-16 LE (the default Windows encoding) to UTF-8 and then print to the console (run chcp 65001 first to switch the code page to UTF-8).

It's pretty trivial to convert UTF-16 to UTF-8. Use this page as a guide if you need characters longer than two bytes.

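// cmd is assumed to point to a null-terminated UTF-16 string.
// Unsigned code units are needed so that values above 0x7FFF do not compare as negative.
const unsigned short* cmd_s = (const unsigned short*)cmd;
size_t i = 0;
while(cmd_s[i] != 0)
{
    unsigned short u16 = cmd_s[i++];
    if(u16 > 0x7F) // Two-byte sequence; correct only for code points up to U+07FF
    {
        unsigned char c0 = (u16 & 0x3F) | 0x80;        // Continuation byte: low 6 bits
        unsigned char c1 = ((u16 >> 6) & 0x1F) | 0xC0; // Lead byte: next 5 bits
        cout << c1 << c0; // UTF-8 writes the lead byte first
    }
    else
    {
        unsigned char c0 = (unsigned char)u16; // ASCII passes through unchanged
        cout << c0;
    }
}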

Of course, you can put it in a function and extend it to handle wider characters (the two-byte case shown is enough for Cyrillic), but I wanted to show the basic algorithm and to prove that it is not hard at all: you don't need any libraries, just a few lines of code.
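
One way the extension to wider characters might look is sketched below. It handles one- to four-byte sequences and surrogate pairs; replacing unpaired surrogates with U+FFFD is an assumption of this sketch, and utf16_to_utf8 is a hypothetical name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode a UTF-16 string as UTF-8 bytes.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine it with the following low surrogate if there is one.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD; // unpaired high surrogate (sketch's choice)
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // low surrogate with no preceding high surrogate
        }
        if (cp < 0x80) {                 // 1 byte: ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {         // 2 bytes: covers Cyrillic
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {       // 3 bytes: rest of the Basic Multilingual Plane
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                         // 4 bytes: code points from surrogate pairs
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The result can be written with std::cout once the console expects UTF-8, for example after chcp 65001 on Windows.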

ScienceDiscoverer
-1

You could use a normal char array that is actually filled with UTF-8 characters. This should allow mixing characters across languages.
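
A minimal sketch of what that means in practice: the array holds UTF-8 bytes, so its byte length differs from its character count. The bytes are spelled out explicitly here so the example does not depend on the source file's encoding; privet_utf8 and utf8_length are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// "Привет" as explicit UTF-8 bytes: each Cyrillic letter takes two bytes.
const char privet_utf8[] = "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";

// Hypothetical helper: count code points by skipping continuation bytes (10xxxxxx).
std::size_t utf8_length(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80)
            ++n;
    return n;
}
```

Such an array can be passed to std::cout unchanged; whether it displays correctly still depends on the terminal expecting UTF-8.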

Michael Speer
-1

You can print wide characters with wprintf.

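#include <clocale>
#include <cwchar>

int main()
{
    // Adopt the user's locale so that non-ASCII characters can be converted for output.
    std::setlocale(LC_ALL, "");
    wchar_t en[] = L"Hello";
    wchar_t ru[] = L"Привет"; // Russian language
    // Pass the text as an argument, not as the format string, and end each line.
    std::wprintf(L"%ls\n", en);
    std::wprintf(L"%ls\n", ru);
    return 0;
}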

Output:

Hello
Привет

Stevoisiak