I'm trying to use WriteConsoleOutput from the WinAPI to write characters to the command prompt window buffer. The thing is, I'd really like to be able to write characters such as ☺ directly into the source code, as-is, instead of using some kind of encoding/notation like '\uFFFF' or '0xFF', since I don't understand them too well (differences between codepages/character sets/etc.).

The code below showcases the simplest form of my problem. Running this code does not print ☺ into the command prompt window, but a question mark (?) instead.

#include <Windows.h>

int main()
{
    HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
    CHAR_INFO c[1] = {0};
    COORD cS = {1, 1};
    COORD cH = {0, 0};
    SMALL_RECT sr = {0, 0, 0, 0};

    c[0].Attributes = FOREGROUND_INTENSITY;
    c[0].Char.UnicodeChar = '☺';
    WriteConsoleOutput(h, c, cS, cH, &sr);
    Sleep(5000);
    return 0;
}

It is vital for my code to display output identically on all Windows versions, regardless of the languages installed/used. So to my knowledge (which admittedly is absolutely minimal), I'd need to set a specific codepage (one which would hopefully be supported by the command prompt on any language version of Windows).

I've tried:
• Changing from using the CHAR_INFO.UnicodeChar to CHAR_INFO.AsciiChar
• Fiddling around with SetConsoleCP and SetConsoleOutputCP functions, but I haven't got a clue on how to utilize them to help me with this problem.
• Changing the Visual Studio -> Project -> Project properties.. -> Character Set setting to every possible value.
• Using specifically either WriteConsoleOutputA or WriteConsoleOutputW in addition to the aforementioned settings
• Changing the source code file encoding to UTF-8 with(/out) signature.


In my project I'm programmatically setting the command prompt font to 8x8 Terminal, which to my knowledge does not support actual unicode characters. The available characters are displayed here. Those characters do include '☺', so I'm not entirely sure my question is about unicode. I have no idea anymore. Please help.

  • You'll have to use c[0].Char.UnicodeChar = L'☺'; And be sure that the compiler understands your source code, use File > Advanced Save Options > select "Unicode (UTF-8 with signature) - Codepage 65001". The BOM that now is emitted is enough to let the compiler know that the source code is encoded in utf8. – Hans Passant Oct 06 '16 at 17:00
  • @hans-passant That gives me a `:` (colon) instead of a smiley face. Even with the encoding saved as UTF-8. – comesuccingfuccslot Oct 06 '16 at 17:01
  • The character is U+263A, 3A == ':'. The L is important. – Hans Passant Oct 06 '16 at 17:05
  • @user6003859 Does your source code file have the UTF-8 BOM mark? Recent updates to the compiler support alternative means to specify the encoding of the source code - see [New Options for Managing Character Sets in the Microsoft C/C++ Compiler](https://blogs.msdn.microsoft.com/vcblog/2016/02/22/new-options-for-managing-character-sets-in-the-microsoft-cc-compiler/). Make sure you are setting `c[0].Char.UnicodeChar` and calling `WriteConsoleOutputW`. – Ian Abbott Oct 06 '16 at 17:09
  • @HansPassant @IanAbbott AH! Finally! Using the `L` macro, with `UnicodeChar`, UTF-8 encoding on the file and `WriteConsoleOutputW()` finally worked. I thought I had tried all the variations on those choices. Thank you so much! I can't upvote comments, so whoever adds an answer with the aforementioned details will get the "Accepted" thingy. Thank you again. – comesuccingfuccslot Oct 06 '16 at 17:23
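
The colon mentioned in the comments can be reproduced with plain arithmetic. The sketch below is only an illustration, assuming the 16-bit value U+263A ends up squeezed into a single byte; `low_byte` is a hypothetical helper, not part of any Windows API.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 8 bits of a 16-bit code unit.
   This mimics what is left when a wide value has to fit into one char. */
unsigned char low_byte(uint16_t code_unit)
{
    return (unsigned char)(code_unit & 0xFF);
}
```

Calling `low_byte(0x263A)` yields `0x3A`, which is the ASCII code for `:`. Writing the literal as `L'☺'` keeps the full 16-bit value instead.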

1 Answer

C source has to be ASCII only. If you embed non-ASCII characters in a C source file, an IDE might show them in what appears to be the correct format, but the compiler quite likely treats them differently, and the executable function you pass them to can treat them differently still. It's just not portable or reliable. But you can use the escape sequence \x to embed arbitrary bytes in C strings.

UTF-8 is good for internal use, but Windows APIs don't yet support it, so you need to convert to Windows 16-bit chars (UTF-16 nearly but not quite) to display extended characters. You also have to ensure that you are calling the wide-character version of the Windows API. Most Windows API functions that take strings come in an A and a W version (ANSI and wide) for binary backwards compatibility. If you query the identifier in the IDE (go to definition etc.) you should see which version you have.
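
As a rough illustration of the conversion described above, here is a minimal sketch that decodes one UTF-8 sequence, written with \x escapes so the source stays ASCII, into a single 16-bit code unit. The helper name `utf8_to_utf16_unit` is hypothetical. It only handles sequences of up to three bytes (the Basic Multilingual Plane) and does not reject overlong or surrogate encodings. On Windows itself, `MultiByteToWideChar` with `CP_UTF8` is the usual way to do this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence of 1 to 3 bytes into a
   single UTF-16 code unit. Returns the number of bytes consumed, or 0 if
   the lead byte or a continuation byte is not valid for this sketch. */
size_t utf8_to_utf16_unit(const unsigned char *s, uint16_t *out)
{
    if (s[0] < 0x80) {                       /* 0xxxxxxx: plain ASCII */
        *out = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0 &&             /* 110xxxxx 10xxxxxx */
        (s[1] & 0xC0) == 0x80) {
        *out = (uint16_t)(((s[0] & 0x1F) << 6) | (s[1] & 0x3F));
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0 &&             /* 1110xxxx 10xxxxxx 10xxxxxx */
        (s[1] & 0xC0) == 0x80 &&
        (s[2] & 0xC0) == 0x80) {
        *out = (uint16_t)(((s[0] & 0x0F) << 12) |
                          ((s[1] & 0x3F) << 6) |
                          (s[2] & 0x3F));
        return 3;
    }
    return 0;                                /* 4-byte or malformed input */
}
```

For the smiley, the UTF-8 bytes `"\xE2\x98\xBA"` decode to `0x263A`, which is the value that would go into `Char.UnicodeChar` before calling `WriteConsoleOutputW`.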

Malcolm McLean