I'm writing a server in C++ and decided to print its messages in my native language, but when I run it the text comes out as gibberish and I have no idea how to fix it. I've tried multiple things, but Visual Studio is a riddle I can't seem to figure out.

    std::cout << ">> Připojení k serveru úspěšné" << std::endl;

When I run it, the console output looks like this:

    >> P?ipojenφ k serveru ·sp?ÜnΘ

Is it hidden somewhere in the settings, or do I have to include some headers?

yakubiq
  • `""` isn't a UTF-8 string literal to begin with. Use `u8".."` for UTF-8 string literals and make sure you save the *file* as UTF-8. If you save your file as Latin1, any non-Latin1 text will get mangled. – Panagiotis Kanavos Jul 10 '23 at 19:41
  • Besides, Windows is a *Unicode* OS. All applications have worked with UTF-16, not UTF-8, since the late 1990s, when the Windows NT line started. Win32 APIs expect double-byte characters, with the ASCII APIs kept for compatibility with non-Unicode applications. Even the "system codepage" is actually the default codepage for non-Unicode applications, *not* the codepage of Windows itself. Use `wstring`, `wcout` or the newer `u16string` etc. – Panagiotis Kanavos Jul 10 '23 at 19:45
  • @PanagiotisKanavos I didn't know that, thanks for the information. However, it works with neither `u8""` nor `std::wcout`. – yakubiq Jul 10 '23 at 19:51
  • @yakubiq "*When I compile it, **it** looks like this*" - what does *it* refer to, exactly? Console output? A memory buffer? Communication data? Can you be more specific? – Remy Lebeau Jul 10 '23 at 19:55
  • @RemyLebeau Yes, sorry. I mean the console output when I start the debugger. – yakubiq Jul 10 '23 at 19:59
  • You can't just write UTF-8 text as-is to a console and expect it to display correctly unless you have first configured the console to accept and print UTF-8. You really should be using `std::wcout` instead, but then you should configure the console to accept UTF-16 instead. There are tons of questions on Stack Overflow on this topic. – Remy Lebeau Jul 10 '23 at 20:03
  • 1) You have to save the file correctly (if you are using non-Latin1 characters in string literals, as you are). 2) You have to configure the console correctly. 3) Even if both of these are done correctly, nothing will work if the console font cannot display your desired Unicode characters. – john Jul 10 '23 at 20:40
  • Does this answer your question? [How do I print UTF-8 from c++ console application on Windows](https://stackoverflow.com/questions/1371012/how-do-i-print-utf-8-from-c-console-application-on-windows) – JosefZ Jul 10 '23 at 21:19
  • @JosefZ Unfortunately not, none of the solutions worked. – yakubiq Jul 11 '23 at 15:25
  • What compiler are you using? Can you provide a complete executable sample code? – Minxin Yu - MSFT Jul 12 '23 at 01:16
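
The first step named in the comments above (the literal must really be stored as UTF-8) can be checked before touching any console settings. The following is only a minimal sketch: `hex_bytes` is a hypothetical helper, not something from this thread, and it simply shows the raw bytes the compiler put into a narrow string.

```cpp
#include <cstdio>
#include <string>

// Hypothetical diagnostic helper (not from the thread): formats every byte
// of `text` as two lowercase hex digits, separated by single spaces.
std::string hex_bytes(const std::string& text)
{
    std::string out;
    char buf[4];
    for (unsigned char byte : text)
    {
        if (!out.empty())
            out += ' ';
        // Each byte becomes exactly two hex digits, e.g. 0xC5 -> "c5".
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(byte));
        out += buf;
    }
    return out;
}
```

Printing `hex_bytes("ř")` from the program should give `c5 99` when the literal was compiled as UTF-8. A single byte instead would suggest that the source file or the execution character set is a legacy single-byte code page, in which case changing the console settings alone cannot help.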

1 Answer

Read and follow UTF-8 Everywhere; even Microsoft encourages all Windows users (and developers) to use UTF-8 code pages in Windows apps:

Use UTF-8 character encoding for optimal compatibility between web apps and other *nix-based platforms (Unix, Linux, and variants), minimize localization bugs, and reduce testing overhead.

UTF-8 is the universal code page for internationalization and is able to encode the entire Unicode character set. It is used pervasively on the web, and is the default for *nix-based platforms.

Following the above references, I created an empty console application from scratch (in Microsoft Visual Studio 2019) and ended up with the following code (saved as a UTF-8 encoded .cpp file):

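    #include <windows.h>
    #include <cstdio>    // printf
    #include <iostream>

    // Store narrow string literals as UTF-8 in the compiled binary.
    #pragma execution_character_set( "utf-8" )

    int main(int argc, char* argv[])
    {
        // Uncomment if the console is not already using the UTF-8 code page (65001):
        // SetConsoleOutputCP(65001);
        // SetConsoleCP(CP_UTF8);
        std::cout << ">>  Ελληνικά  Русский  Češtinář" << std::endl;
        std::cout << ">> Připojení k serveru úspěšné" << std::endl;

        // Echo the command line parameters to show they arrive intact as well.
        for (int i = 1; i < argc; ++i)
        {
            printf("param %d = %s\n", i, argv[i]);
        }
        return 0;
    }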

Project properties, C/C++ command line parameters (debug configuration):

    /JMC /permissive- /ifcOutput "x64\Debug\" /GS /W3 /Zc:wchar_t /ZI /Gm- /Od /sdl /Fd"x64\Debug\vc142.pdb" /Zc:inline /fp:precise /D "_DEBUG" /D "_CONSOLE" /D "_MBCS" /errorReport:prompt /WX- /Zc:forScope /RTC1 /Gd /MDd /FC /Fa"x64\Debug\" /EHsc /nologo /Fo"x64\Debug\" /Fp"x64\Debug\So76656796.pch" /diagnostics:column

The output shows that all strings, both the hard-coded ones and those passed as command line parameters, are handled properly:

    So76656796.exe Ελληνικά Русский Češtinář 
    >>  Ελληνικά  Русский  Češtinář
    >> Připojení k serveru úspěšné
    param 1 = Ελληνικά
    param 2 = Русский
    param 3 = Češtinář
    param 4 = 
JosefZ
  • Hi, thanks a lot. The `#pragma execution_character_set( "utf-8" )` didn't work by itself, but then I added `SetConsoleOutputCP(CP_UTF8);` and it works just fine. – yakubiq Jul 12 '23 at 13:28
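
The combination reported as working in the comment above (UTF-8 literals plus `SetConsoleOutputCP(CP_UTF8)`) could be wrapped so that the console's previous code page is put back when the program exits. This is a sketch under assumptions: the class name `Utf8ConsoleOutput` is made up for illustration, and the non-Windows branch assumes the terminal already uses UTF-8.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical RAII helper (not from the thread): switches the console
// output code page to UTF-8 and restores the previous one on destruction.
class Utf8ConsoleOutput
{
public:
    Utf8ConsoleOutput()
    {
#ifdef _WIN32
        previous_ = GetConsoleOutputCP();            // 0 means the query failed
        active_ = SetConsoleOutputCP(CP_UTF8) != 0;  // nonzero means success
#else
        active_ = true; // assumption: other platforms already print UTF-8
#endif
    }

    ~Utf8ConsoleOutput()
    {
#ifdef _WIN32
        // Only restore when the switch succeeded and the old value is known.
        if (active_ && previous_ != 0)
            SetConsoleOutputCP(previous_);
#endif
    }

    Utf8ConsoleOutput(const Utf8ConsoleOutput&) = delete;
    Utf8ConsoleOutput& operator=(const Utf8ConsoleOutput&) = delete;

    // True when UTF-8 output is believed to be in effect.
    bool active() const { return active_; }

private:
#ifdef _WIN32
    UINT previous_ = 0;
#endif
    bool active_ = false;
};
```

Declaring `Utf8ConsoleOutput guard;` at the top of `main`, before the first write to `std::cout`, would apply the setting for the lifetime of the program. The string literals still have to be stored as UTF-8, as in the answer's code.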