
I want to read a string from the terminal and save it into a std::string as UTF-8.

When I type characters like áéíóú (each of which occupies 2 bytes in UTF-8), each one is read as a single NUL character.

I tried this code on Linux (without the SetConsoleCP calls, obviously) and it works. What do I need to do to make it work on Windows?

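#include <cstddef>
#include <iostream>
#include <string>
#include <windows.h>

// Print the numeric value of every byte in str, one per line.
void PrintStringByChar(const std::string& str) {
    for (std::size_t i = 0; i < str.size(); i++) {
        // Go through unsigned char so bytes >= 0x80 print as 128..255, not negative.
        int x = static_cast<unsigned char>(str[i]);
        std::cout << "[" << x << "]" << std::endl;
    }
}

int main() {
    // Ask the console to use UTF-8 for both output and input.
    SetConsoleOutputCP(CP_UTF8);
    SetConsoleCP(CP_UTF8);

    std::cout << "Write: ";
    std::string text;
    std::getline(std::cin, text);
    PrintStringByChar(text);
}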
J. Orbe
  • `SetConsoleCP(CP_UTF8)` does not work. The console session server (conhost.exe) assumes that the input codepage is a single-byte encoding. It does not support multibyte UTF-8, so non-ASCII characters that require 2-4 bytes are converted to NUL characters when reading via WINAPI `ReadFile` or `ReadConsoleA`. Switch the stdin stream to wide-character or UTF-16 mode, which should (if supported by the CRT version) ultimately call `ReadConsoleW` to read native wide character strings from the console. – Eryk Sun Apr 08 '20 at 16:17
  • OTOH, since Windows 8 `SetConsoleOutputCP(CP_UTF8)` for the output codepage does work correctly, so you have the option of writing UTF-8 byte strings via `WriteFile` and `WriteConsoleA` (and `std::cout`, etc) as long as you don't need to support Windows 7 and earlier. Be courteous, however. It's not your console; you're just borrowing it. Restore the original codepage when exiting and when spawning child console processes that inherit the current console. – Eryk Sun Apr 08 '20 at 16:23
  • @ErykSun: Interesting, you probably saved my life (in the world of the Windows string-encoding mess). Are there any official sources for this, e.g. from Microsoft? I wonder what the use cases for `SetConsoleCP(CP_UTF8)` are, then. – TeaAge Solutions Nov 28 '22 at 06:13
  • @ErykSun and other readers who landed here: I found another answer (also by Eryk Sun), which in my opinion is a very good explanation! https://stackoverflow.com/questions/39736901/chcp-65001-codepage-results-in-program-termination-without-any-error/39745938#39745938 – TeaAge Solutions Nov 28 '22 at 09:17
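
The approach suggested in the comments (read wide characters with `ReadConsoleW`, then convert to UTF-8) could be sketched roughly as below. This is only an illustration, not code from the question or the comments: `Utf16ToUtf8` and `ReadConsoleLineUtf8` are hypothetical helper names, the 1024-character buffer is an arbitrary choice (longer lines would come back in pieces), and the Windows part assumes stdin is a real console rather than a pipe or file. The conversion is written by hand so it compiles anywhere; on Windows, `WideCharToMultiByte` with `CP_UTF8` would do the same job.

```cpp
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: encode a UTF-16 string as UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string Utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the low surrogate that follows, if any.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate with no preceding high surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

#ifdef _WIN32
// Hypothetical helper: read one line from the console with ReadConsoleW
// and return it as UTF-8. Returns an empty string if the read fails.
std::string ReadConsoleLineUtf8() {
    HANDLE input = GetStdHandle(STD_INPUT_HANDLE);
    wchar_t buf[1024];
    DWORD count = 0;
    if (!ReadConsoleW(input, buf, 1024, &count, nullptr)) {
        return {};
    }
    // wchar_t is 16 bits wide on Windows, so each unit maps to one char16_t.
    std::u16string line(buf, buf + count);
    // ReadConsoleW keeps the trailing "\r\n"; strip it.
    while (!line.empty() && (line.back() == u'\r' || line.back() == u'\n')) {
        line.pop_back();
    }
    return Utf16ToUtf8(line);
}
#endif
```

In the question's `main`, `std::getline(std::cin, text)` would then be replaced by `std::string text = ReadConsoleLineUtf8();`, and the `SetConsoleCP(CP_UTF8)` call would no longer be needed for input.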
