I don't quite understand how std::setlocale works.
Here is my simple program:
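// main.cpp
#include <clocale>
#include <iostream>
#include <string>

int main(void) {
    // str holds the UTF-8 encoded bytes of the text
    std::string str = u8"Привет, мир";
    // printed before any locale change
    std::cout << str << std::endl;
    std::setlocale(LC_ALL, ".UTF8");
    // printed after switching the C locale to UTF-8
    std::cout << str << std::endl;
    return 0;
}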
Prerequisites
The code is compiled with Visual Studio 2022 (version 17.3.6), CL version 19.33.31630.
The program runs on Windows 10 (21H2 19044.2728) in a PowerShell terminal with CP1251 encoding.
PS> $PSVersionTable.PSVersion.ToString()
5.1.19041.2673
PS> [Console]::OutputEncoding
IsSingleByte : True
BodyName : koi8-r
EncodingName : Кириллица (Windows)
HeaderName : windows-1251
WebName : windows-1251
WindowsCodePage : 1251
IsBrowserDisplay : True
IsBrowserSave : True
IsMailNewsDisplay : True
IsMailNewsSave : True
EncoderFallback : System.Text.InternalEncoderBestFitFallback
DecoderFallback : System.Text.InternalDecoderBestFitFallback
IsReadOnly : False
CodePage : 1251
Here is the result of execution:
PS D:\VS\Projects\playground\x64\Debug> .\playground.exe
РџСЂРёРІРµС‚, РјРёСЂ
Привет, мир
Question
The gibberish on the first line is fine; I expected that.
But why is the second line (after the setlocale call) not gibberish?
As far as I understand the setlocale documentation, this function affects only locale-dependent functions such as std::toupper, std::isalpha, etc. It does not mention changing the encoding of stdout at all.
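To make concrete what I mean by "locale-dependent functions", here is a minimal sketch separate from my program. The helper names (current_locale, try_set_locale, upper, demo_c_locale) are made up for illustration. It relies only on the "C" locale, which is always available, and on std::setlocale returning a null pointer on failure.

```cpp
#include <cassert>
#include <cctype>
#include <clocale>
#include <string>

// Hypothetical helper: name of the current C locale.
// Passing nullptr queries the locale without changing it.
std::string current_locale() {
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}

// Hypothetical helper: tries to switch the C locale and reports success.
bool try_set_locale(const char* name) {
    return std::setlocale(LC_ALL, name) != nullptr;
}

// Upper-cases byte by byte with std::toupper, which consults the C locale.
std::string upper(std::string s) {
    for (char& c : s) {
        // Cast to unsigned char first: negative values are undefined for toupper.
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}

// Usage sketch: in the "C" locale only a-z are mapped, so the byte 0xE0
// (a lowercase Cyrillic letter in CP1251) is left untouched.
void demo_c_locale() {
    assert(try_set_locale("C"));
    assert(upper("abc\xE0") == "ABC\xE0");
}
```

Under a locale that uses a single-byte Cyrillic code page, the same call to upper may change that last byte as well, which is the kind of effect I expected from setlocale.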
I thought that std::cout just writes the bytes of the std::string to stdout, but it seems to behave more cleverly.
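One way to check what the string actually contains, independently of how the terminal decodes it, is to dump its bytes in hex. This is only a sketch; to_hex and demo_to_hex are hypothetical helpers, not part of my program.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: renders every byte of the string as two lowercase
// hex digits, separated by single spaces.
std::string to_hex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}

// Usage sketch: the first two letters of my string in UTF-8
// are the four bytes D0 9F D1 80.
void demo_to_hex() {
    assert(to_hex("\xD0\x9F\xD1\x80") == "d0 9f d1 80");
}
```

If to_hex(str) gives the same bytes before and after the setlocale call, then the string itself is unchanged and any conversion must happen somewhere on the output path.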
It seems that std::cout checks the terminal encoding and, if it differs from the encoding set by setlocale, automatically converts the bytes from the "program locale" to the "terminal locale".
Is this behaviour cross-platform and described in the standard?