A console application can produce output in several ways.
- For a console handle, WriteConsoleW can be used directly when the text is already in Unicode (UTF-16).
- To use WriteConsoleA or WriteFile on a console handle, first convert the Unicode text to multi-byte with WideCharToMultiByte, using GetConsoleOutputCP() as the code page.
- If the text to output is not Unicode initially (say, UTF-8 or ANSI), first convert it to Unicode with MultiByteToWideChar (with CP_UTF8 or CP_ACP), and then convert it back to multi-byte with WideCharToMultiByte(GetConsoleOutputCP(), ..).
By default, GetConsoleOutputCP() usually returns the same value as GetOEMCP(), so passing it to MultiByteToWideChar and WideCharToMultiByte has the same effect as passing CP_OEMCP (this constant is translated to GetOEMCP()).
When the output handle is redirected to a file, only WriteFile can be used. However, an application can write data to the file in any format: Unicode, ANSI (CP_ACP), UTF-8 (CP_UTF8), and so on. Which format is used depends heavily on the specific application, and you cannot fully control it. Usually you will receive multi-byte output in CP_OEMCP encoding. You then need to decide how to process it. Most likely you will first convert it to Unicode and use the Unicode form. If you need ANSI, one more conversion is required.
For example, if you pass pipe output in CP_OEMCP encoding to OutputDebugStringA, non-English text comes out unreadable. After two conversions, CP_OEMCP -> Unicode -> CP_ACP, the text is displayed correctly by OutputDebugStringA. But because OutputDebugStringW exists, converting to Unicode alone is enough here.
Some applications also have special options that control the format of output to a file. For example, ipconfig.exe looks for the "OutputEncoding" environment variable and, depending on its string value ("Unicode", "Ansi", "UTF-8"), produces different output. By default (if the variable does not exist or has an unknown value), CP_OEMCP is used.
Example of a pipe read procedure, assuming the input data is in CP_OEMCP encoding:
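    #include <windows.h>
    #include <malloc.h> // alloca

    extern bool g_bUseAnsi; // assumed to be defined elsewhere: print through OutputDebugStringA if true

    // The debugger can print a too-big buffer incompletely, so split it into small chunks.
    // The buffer must be writable and have room for one extra element at p[len].
    template<typename T> void DoPrint(T* p, ULONG len, void (WINAPI* fnOutput)(const T*))
    {
        ULONG cb;
        T* q;
        do
        {
            cb = min(len, 256);
            q = p + cb;
            T c = *q;    // save the element that follows the chunk
            *q = 0;      // temporarily terminate the chunk
            fnOutput(p);
            *q = c;      // restore it
            p = q;
        } while (len -= cb);
    }

    // Note: alloca is only safe for small pipe buffers, and a read can end
    // in the middle of a multi-byte character on double-byte code pages.
    void OnRead(PVOID buf, ULONG cbTransferred)
    {
        if (cbTransferred)
        {
            // CP_OEMCP -> Unicode: first call returns the required length
            if (int len = MultiByteToWideChar(CP_OEMCP, 0, (PSTR)buf, cbTransferred, 0, 0))
            {
                PWSTR pwz = (PWSTR)alloca((1 + len) * sizeof(WCHAR));
                if ((len = MultiByteToWideChar(CP_OEMCP, 0, (PSTR)buf, cbTransferred, pwz, len)) != 0)
                {
                    if (g_bUseAnsi)
                    {
                        // Unicode -> CP_ACP, again in two calls
                        if ((cbTransferred = WideCharToMultiByte(CP_ACP, 0, pwz, len, 0, 0, 0, 0)) != 0)
                        {
                            PSTR psz = (PSTR)alloca(cbTransferred + 1);
                            if ((cbTransferred = WideCharToMultiByte(CP_ACP, 0, pwz, len, psz, cbTransferred, 0, 0)) != 0)
                            {
                                DoPrint(psz, cbTransferred, OutputDebugStringA);
                            }
                        }
                    }
                    else
                    {
                        DoPrint(pwz, len, OutputDebugStringW);
                    }
                }
            }
        }
    }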
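DoPrint temporarily writes a terminator into the caller's buffer, which is why the buffers above are allocated one element larger. If that is inconvenient, a variant that copies each chunk could look like this. PrintChunks is a hypothetical name and this is only a sketch of the same chunking idea, not a replacement for the code above.

```cpp
#include <cstddef>
#include <string>

// Split [p, p + len) into pieces of at most `chunk` elements and pass each one,
// as a NUL-terminated copy, to `output`. The input buffer is never modified.
template <typename T, typename Fn>
void PrintChunks(const T* p, std::size_t len, Fn output, std::size_t chunk = 256)
{
    if (chunk == 0) return;                      // avoid an endless loop
    while (len)
    {
        std::size_t n = len < chunk ? len : chunk;
        std::basic_string<T> piece(p, n);        // c_str() is NUL-terminated
        output(piece.c_str());
        p += n;
        len -= n;
    }
}
```

With the Windows headers included it could be called as PrintChunks(pwz, len, OutputDebugStringW). Like DoPrint, it splits at a fixed element count, so a chunk boundary can fall inside a multi-byte character or a surrogate pair.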
About your specific case: ipconfig.exe uses WriteConsoleW for output to the console. As a result, it does not depend on the current system locale and can display multi-language text correctly. Other tools, such as route.exe, use WriteFile for output (both to the console and to a file) and first convert the Unicode text to multi-byte with WideCharToMultiByte(CP_OEMCP, ..). This causes problems when the text contains characters that do not exist in the CP_OEMCP code page (the current system locale). If you have CP437, Hebrew and Russian characters are lost in the Unicode -> CP_OEMCP conversion, so the only solution is to output Unicode directly to both the console and the file. Whether this is possible depends on the specific application. For route.exe it is not possible. For ipconfig.exe it is possible, because it always writes to the console in Unicode, and it can also write to a file in Unicode or UTF-8 if you set "OutputEncoding" to "Unicode" or "UTF-8".