
How can I print these strings with "cout" in C++ so that they show correctly in the Visual Studio Output window, now that we have the C++14 revision? (All I get is missing symbols or question marks.)

#include <iostream>
using std::cout;

int main()
{
    cout << "Ñá" << ".\n"; //Spanish

    cout << "forêt intérêt" << ".\n";  //French

    cout << "Gesäß" << ".\n";  //German

    cout << "取消波蘇日奇諾" << ".\n";  //Chinese

    cout << "日本人のビット" << ".\n";  //Japanese

    cout << "немного русский" << ".\n";  //Russian

    cout << "ένα κομμάτι της ελληνικής" << ".\n";  //Greek

    cout << "ਯੂਨਾਨੀ ਦੀ ਇੱਕ ਬਿੱਟ" << ".\n";  //Punjabi

    cout << "کمی از ایران " << ".\n"; //Persian

    cout << "కానీ ఈ ఏమి నరకం ఉంది?" << ".\n"; //Telugu

    cout << "Но какво, по дяволите, е това?" << ".\n"; //Bulgarian

    cout.flush();
    return 0;
}

Besides, what is the proper Visual Studio configuration for printing these strings with "cout" correctly? (Fonts able to show these characters, Unicode settings, etc.)

As far as I know:

In Project Properties > General, "Character Set" must be set to "Use Unicode Character Set".

In C/C++ > Preprocessor, the preprocessor definitions must include UNICODE.

In the Visual Studio menu Tools > Options > Environment > Fonts and Colors, the text editor and the Output window must use "Lucida Console" or "Consolas", so that the font can show the characters.

But this isn't enough.

Reaversword
  • Possible duplicate of [How to print Unicode character in C++?](http://stackoverflow.com/questions/12015571/how-to-print-unicode-character-in-c) – user4581301 Oct 29 '15 at 01:55
  • Technically, it's not an *exact* duplicate, as this one is for a *specific* C++ implementation and the other one isn't. – dan04 Oct 29 '15 at 02:04
  • have you looked at http://stackoverflow.com/questions/2849010/output-unicode-to-console-using-c – GreatAndPowerfulOz Oct 29 '15 at 02:16
  • or http://stackoverflow.com/questions/2492077/output-unicode-strings-in-windows-console-app – GreatAndPowerfulOz Oct 29 '15 at 02:17
  • or http://stackoverflow.com/questions/12015571/how-to-print-unicode-character-in-c – GreatAndPowerfulOz Oct 29 '15 at 02:19
  • I've checked the first and third ones. Using the wmain from the second link, I get the output without the "Ελληνικά" characters, plus a wonderful "Debug Assertion Failed!" error if I don't comment out my other code (never mind, it's just learning/testing statements). I've checked the international keyboard, the "console" font for the editor and Output window, and a long shaman-voodoo list of settings here and there. Right now either my IDE or I am like this: https://www.youtube.com/watch?v=lfft9Jx9gJk – Reaversword Oct 29 '15 at 03:26

2 Answers


Some of those suggested duplicates look out of date now that we have C++14, where this is much easier. The examples in this answer should work, portably, if you have your locale set properly. To do that: set your console font to Lucida Console or Consolas; run `chcp 65001` in the console before running the program (or edit the registry to do this by default); save your source file as multibyte (UTF-8) or Unicode (UTF-16); and set the font of the IDE to Consolas.

If this is too much of a rigmarole, other people have posted instructions for how to change the code page within the program, but you still want to save your source file as UTF-8 so you can use the foreign characters in string constants, and to change your font to one that can display them.
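
For the in-program route, a minimal sketch might look like the following. It assumes the Win32 function `SetConsoleOutputCP` and the `CP_UTF8` constant (65001) from `<windows.h>`. The helper name `enable_utf8_console` is made up for illustration, and it does nothing on other platforms.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: switch the console's output code page to UTF-8.
// Returns true on success. On non-Windows platforms there is nothing to
// change, so it simply reports success.
bool enable_utf8_console()
{
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;  // CP_UTF8 is code page 65001
#else
    return true;
#endif
}
```

Call it once at the top of `main`, before the first write to `cout`. It only changes how the console interprets the bytes it receives, so the string literals still have to be UTF-8 and the console font still has to contain the glyphs.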

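#include <cstddef>
#include <cstdlib>  // EXIT_SUCCESS
#include <iostream>
#include <locale>

using std::cout;
using std::endl;

// String literals are const, so the element type must be const char *.
constexpr const char * const texts[] = {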
  u8"Ñá", //Spanish
  u8"forêt intérêt", //French
  u8"Gesäß", //German
  u8"取消波蘇日奇諾", //Chinese
  u8"日本人のビット", //Japanese
  u8"немного русский", //Russian
  u8"ένα κομμάτι της ελληνικής", // Greek
  u8"ਯੂਨਾਨੀ ਦੀ ਇੱਕ ਬਿੱਟ", // Punjabi (wtf?). xD
  u8"کمی از ایران ", // Persian (I know it, from 300 movie)
  u8"కానీ ఈ ఏమి నరకం ఉంది?", //Telugu (telu-what?)
  u8"Но какво, по дяволите, е това?" //Bulgarian
};

constexpr size_t ntexts = sizeof(texts) / sizeof(texts[0]);

int main(void)
{
  std::locale::global(std::locale(""));
  cout.imbue(std::locale());

  for ( size_t i = 0; i < ntexts; ++i )
    cout << texts[i] << endl;

  return EXIT_SUCCESS;
}

Alternatively, you can make them wide-character strings and use wcout instead of cout. The following might work better when no UTF-8 locale is set:

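#include <cstddef>
#include <cstdlib>  // EXIT_SUCCESS
#include <iostream>
#include <locale>

using std::wcout;
using std::endl;

// Wide string literals are const, so the element type must be const wchar_t *.
constexpr const wchar_t * const texts[] = {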
  L"Ñá", //Spanish
  L"forêt intérêt", //French
  L"Gesäß", //German
  L"取消波蘇日奇諾", //Chinese
  L"日本人のビット", //Japanese
  L"немного русский", //Russian
  L"ένα κομμάτι της ελληνικής", // Greek
  L"ਯੂਨਾਨੀ ਦੀ ਇੱਕ ਬਿੱਟ", // Punjabi (wtf?). xD
  L"کمی از ایران ", // Persian (I know it, from 300 movie)
  L"కానీ ఈ ఏమి నరకం ఉంది?", //Telugu (telu-what?)
  L"Но какво, по дяволите, е това?" //Bulgarian
};

constexpr size_t ntexts = sizeof(texts) / sizeof(texts[0]);

int main(void)
{
  std::locale::global(std::locale(""));
  wcout.imbue(std::locale());

  for ( size_t i = 0; i < ntexts; ++i )
    wcout << texts[i] << endl;

  return EXIT_SUCCESS;
}

Imbuing with the current locale, as in these examples, should set up the streams to use the right character set on output automatically. The second example is less likely to assume that the strings are in the wrong character set.
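
When the output shows wrong symbols rather than question marks, it can help to check which bytes the compiler actually stored for a literal. Here is a minimal diagnostic sketch; the helper name `hex_bytes` is hypothetical. If the literal was stored as UTF-8, "Ñá" dumps as `c3 91 c3 a1`; if it was stored in code page 1252, it dumps as `d1 e1`.

```cpp
#include <cstdio>
#include <string>

// Hypothetical diagnostic: render each byte of s as two lowercase hex
// digits, separated by single spaces, so the literal's real encoding
// can be inspected independently of the console.
std::string hex_bytes(const std::string& s)
{
    std::string out;
    char buf[4];
    for (unsigned char c : s) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(c));
        if (!out.empty())
            out += ' ';
        out += buf;
    }
    return out;
}
```

Printing `hex_bytes(texts[0])` uses only ASCII characters, so the result is readable even on a console that is misconfigured.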

Davislor
  • I've tested your code. The first one shows the first 3 lines with wrong symbols and the rest full of question marks. The second one shows the first 3 lines perfectly OK and the rest full of question marks. I don't know how (or what) to set to CP65001 on Windows (no idea what this is). I've been looking for some info, and it appears I need to modify the Windows registry (with regedit), but I don't have the entry needed for that! Here: http://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8 – Reaversword Oct 29 '15 at 15:33
  • And besides "set to CP65001", is it necessary to have, in the Visual Studio project properties, General > Character Set set to Unicode and C/C++ > Preprocessor > Preprocessor Definitions set to UNICODE, and in Tools (Visual Studio menu) > Options > Fonts and Colors, the "Consolas" font for the editor and Output window (to have a font able to show all the uncommon characters)? Are any of these steps unnecessary? Are any other needed steps missing? – Reaversword Oct 29 '15 at 15:49
  • And I'm reading here that VC++ doesn't support creating UTF-8 locales: https://social.msdn.microsoft.com/Forums/vstudio/es-ES/2ff45989-6213-495a-a509-0278417eb0ab/utf8-output-using-c-locales?forum=vclanguage – Reaversword Oct 29 '15 at 16:05
  • @Reaversword I don’t have an installation handy to test right now, but when running in the console, set the console font to Lucida Console or Consolas, and run `chcp 65001` to set the code page to UTF-8 first. Wrong symbols generally mean the wrong character set selected, and question marks that a character is not present in the font. – Davislor Oct 29 '15 at 17:26
  • OK, if I understood correctly, I need to build my program and, outside Visual Studio, open the Windows console (cmd.exe), run "chcp 65001", and then run my program. Is that right? If so, lots of question marks still appear, and even (with the 2nd code) the first three lines don't show properly. Also, I have no idea how to change the cmd.exe font or how to check which font is the default one. Anyway, is there no way to see the output correctly in Visual Studio's own Output window? – Reaversword Oct 29 '15 at 19:38
  • OK, just right-click the title bar of cmd.exe > Properties > Font. Whether I pick Lucida Console or Consolas, the question marks are still there. But the first three lines now read properly. – Reaversword Oct 29 '15 at 19:50
  • @Reaversword The first three lines are the ones that contain only characters from the default code page, 1252. – Davislor Oct 29 '15 at 21:24
  • After adding chcp 65001 on the command line I can clearly read: Active code page: 65001. So, would I need to download a language pack and change my system language to see the other lines (like downloading Persian and switching my system to it)? And inside Visual Studio, is there a way to use this chcp 65001? – Reaversword Oct 30 '15 at 18:41
  • You shouldn’t need a language pack, just a font that can support it. A non-portable hack is to call `SetConsoleOutputCP(65001)` (https://msdn.microsoft.com/en-us/library/windows/desktop/ms686036(v=vs.85).aspx). You might wrap it in an `#ifdef` block for portable code. – Davislor Oct 30 '15 at 19:48
  • Another MSVC-specific hack is to call `_setmode(_fileno(stdout), _O_U8TEXT);` and `_setmode(_fileno(stderr), _O_U8TEXT);`. – Davislor Oct 30 '15 at 19:57
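
The `_setmode` suggestion from the comments could be sketched roughly as follows. It assumes the MSVC runtime's `_setmode`, `_fileno` and `_O_U8TEXT` from `<io.h>` and `<fcntl.h>`. The helper name `set_stdout_u8text` is made up for illustration and is a no-op on other platforms.

```cpp
#include <cstdio>

#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

// Hypothetical helper: put stdout into the MSVC runtime's UTF-8 text mode.
// _setmode returns -1 on failure, so anything else counts as success.
// Outside Windows there is no such mode, so this just reports success.
bool set_stdout_u8text()
{
#ifdef _WIN32
    return _setmode(_fileno(stdout), _O_U8TEXT) != -1;
#else
    return true;
#endif
}
```

After this call, write to the stream only through wide functions such as `std::wcout` or `wprintf`. Narrow output (`cout`, `printf`) on a stream in this mode triggers a debug assertion in the MSVC runtime.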

You can use the script table from Unicode.org to determine which range your character belongs to.

You probably need something like the `g_unichar_get_script` function in GLib.
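
As a rough illustration of the range idea, the sketch below looks a code point up in a small hand-written table. This is not how GLib does it: the table is a deliberately incomplete excerpt of Unicode block ranges, blocks only approximate scripts (Persian text, for example, falls in the Arabic block), and the names `UnicodeBlock`, `kBlocks` and `block_name` are hypothetical.

```cpp
#include <string>

// One contiguous code point range and the name of its Unicode block.
struct UnicodeBlock {
    char32_t first;
    char32_t last;
    const char* name;
};

// Incomplete excerpt covering the scripts used in the question.
static const UnicodeBlock kBlocks[] = {
    {0x0000, 0x007F, "Basic Latin"},
    {0x0080, 0x00FF, "Latin-1 Supplement"},
    {0x0370, 0x03FF, "Greek and Coptic"},
    {0x0400, 0x04FF, "Cyrillic"},
    {0x0600, 0x06FF, "Arabic"},
    {0x0A00, 0x0A7F, "Gurmukhi"},
    {0x0C00, 0x0C7F, "Telugu"},
    {0x3040, 0x309F, "Hiragana"},
    {0x30A0, 0x30FF, "Katakana"},
    {0x4E00, 0x9FFF, "CJK Unified Ideographs"},
};

// Return the block name for cp, or "Unknown" if the table has no entry.
std::string block_name(char32_t cp)
{
    for (const UnicodeBlock& b : kBlocks) {
        if (cp >= b.first && cp <= b.last)
            return b.name;
    }
    return "Unknown";
}
```

For example, `block_name(0x00D1)` (the code point of Ñ) yields "Latin-1 Supplement". A real application would rely on a complete script database such as the one behind `g_unichar_get_script` rather than a table like this.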

Eugene
  • As far as I know, the Unicode code for "Ñá" should be cout<<"\u00D1\u00E1"; the range, I think, is Latin-1 Supplement (http://www.unicode.org/charts/), but that doesn't give correct output either. Isn't there something like the Boost library to deal with this? – Reaversword Oct 29 '15 at 03:47
  • I guess language detection inside one script is a more complex problem requiring some text analysis. I think you should check high level text engines like Pango. – Eugene Oct 29 '15 at 04:08