67

The code page 866 charset that Windows' cmd.exe uses by default is poor and inconvenient compared with glorious Unicode.

Can I make Unicode the default, or replace cmd.exe with another console and make that the default, so that programs use it instead of cmd.exe?

I understand that chcp 65001 changes the encoding only in the running console. I want to change the charset at the system level.

Zombo
Doctor Coder
  • 2
    866 is a code page for Cyrillic script. Changing it is quite liable to break any old console mode program that expects that page to be the default. It is not like you won't notice: you can't read the program's output anymore. – Hans Passant Jan 01 '13 at 14:11
  • 6
    There is no such thing as Unicode charset in cmd.exe. `chcp 65001` provides some UTF-8 decoding but it's very rudimentary and doesn't provide proper input. – Alastair McCormack Jan 01 '16 at 17:46
  • See https://stackoverflow.com/a/33475373/3027266 – Wernfried Domscheit Aug 05 '21 at 14:11

4 Answers

57

After I tried algirdas' solution, my Windows crashed (Win 7 Pro 64bit) so I decided to try a different solution:

  1. Open the Run dialog (Win+R)
  2. Type cmd /K chcp 65001

You will get mostly what you want. To start it from the taskbar or anywhere else, make a shortcut (you can name it cmd.unicode.exe or whatever you like) and change its Target to C:\Windows\System32\cmd.exe /K chcp 65001.
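The same launch can be sketched from a script. This is a minimal Python sketch, assuming Python is available; `utf8_console_command` and its `extra_command` parameter are hypothetical names introduced here, not part of any Windows tool. It only builds the same `cmd /K chcp 65001` command line and starts it when run on Windows.

```python
import subprocess
import sys


def utf8_console_command(extra_command=None):
    """Build the argument list for a cmd.exe session that starts in code page 65001.

    extra_command is a hypothetical convenience: an optional command to run
    after chcp has switched the code page.
    """
    startup = "chcp 65001"
    if extra_command:
        # In cmd.exe, "&&" runs the second command only if the first succeeded.
        startup += " && " + extra_command
    # /K keeps the console open after the startup command has run.
    return ["cmd.exe", "/K", startup]


if __name__ == "__main__" and sys.platform == "win32":
    # Only attempt to start cmd.exe on Windows.
    subprocess.run(utf8_console_command())
```

Like the shortcut, this affects only the console session it starts, not the system default.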

Superbest
VerteXVaaR
  • You meant "cmd.unicode.bat", not ".exe", right? If you really do mean .exe, how can I make an exe file in Win10? – Martin Ille Oct 15 '16 at 23:39
  • 3
    Is it possible to automatically add that command in a batch file? I mean like this: `START cmd /K chcp 65001 START DTRé.xls` – Regie Baguio Mar 03 '17 at 23:02
  • 6
    With codepage 65001, the console in Windows 7 (not Windows 8+) incorrectly returns the number of decoded wide-character code points for UTF-8 written to it, rather than the number of bytes written to it, which `WriteFile` is *supposed* to return. This causes applications to retry writing what they mistakenly determine is the remaining part of the byte string, repeatedly until the console returns that all 'bytes' (actually decoded wide characters) have been written. The result looks like a trailing stream of garbage characters after each print that contains non-ASCII characters. – Eryk Sun Sep 14 '17 at 22:15
  • 7
    With codepage 65001, the console in all versions of Windows (even the new console in Windows 10) does not support non-ASCII input. The size of the scratch buffer it uses to encode its Unicode input buffer is based on the system ANSI codepage, which typically is 1 byte per character. But non-ASCII UTF-8 is 2-4 bytes per character. Thus encoding non-ASCII input fails in the console. However, `ReadFile` returns that it 'successfully' read 0 bytes. Most programs interpret this as EOF, and a REPL/shell will typically exit in this case. – Eryk Sun Sep 14 '17 at 22:21
  • @Martin - bat2exe – Berry Tsakala Jun 09 '19 at 10:59
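The byte-versus-character mismatch that the two comments by Eryk Sun describe can be illustrated outside the console. This is a small Python sketch with an arbitrary sample string; it does not reproduce the console bug itself, it only shows why a count of decoded characters differs from a count of UTF-8 bytes for non-ASCII text, while a single-byte code page such as 866 uses one byte per character.

```python
text = "Привет"  # six Cyrillic letters, chosen only as an example

utf8_bytes = text.encode("utf-8")
cp866_bytes = text.encode("cp866")

chars = len(text)            # number of code points
nbytes_utf8 = len(utf8_bytes)    # Cyrillic letters take 2 bytes each in UTF-8
nbytes_cp866 = len(cp866_bytes)  # code page 866 is one byte per character

print(chars, nbytes_utf8, nbytes_cp866)
```

A caller that wrote the UTF-8 bytes but was told only the character count was written would conclude that part of the buffer is still pending, which matches the retry behavior described for Windows 7.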
19

Open an elevated Command Prompt (run cmd as administrator). Query the registry for the TrueType fonts available to the console:

    REG query "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont"

You'll see output like:

    0    REG_SZ    Lucida Console
    00    REG_SZ    Consolas
    936    REG_SZ    *新宋体
    932    REG_SZ    *MS ゴシック

Now we need to add a TrueType font that supports the characters you need, such as Courier New. We do this by adding another zero to the value name, so in this case the next one would be "000":

    REG ADD "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont" /v 000 /t REG_SZ /d "Courier New"

Now we make UTF-8 (code page 65001) the default console code page:

    REG ADD HKCU\Console /v CodePage /t REG_DWORD /d 65001 /f

Set the default font to "Courier New":

    REG ADD HKCU\Console /v FaceName /t REG_SZ /d "Courier New" /f

Set the font size to 20:

    REG ADD HKCU\Console /v FontSize /t REG_DWORD /d 20 /f

Enable QuickEdit if you like:

    REG ADD HKCU\Console /v QuickEdit /t REG_DWORD /d 1 /f
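
For anyone who prefers a script to typing the commands, the two values that matter for UTF-8 output (CodePage and FaceName) could be written with Python's standard `winreg` module. This is a sketch, not a tested installer: `CONSOLE_DEFAULTS` and `apply_console_defaults` are names made up here, the values are copied from the commands above, and FontSize and QuickEdit are deliberately left out.

```python
import sys

# Values copied from the REG ADD commands above (assumption: only these two
# are needed for UTF-8 output; FontSize and QuickEdit are omitted).
CONSOLE_DEFAULTS = {
    "CodePage": ("REG_DWORD", 65001),
    "FaceName": ("REG_SZ", "Courier New"),
}


def apply_console_defaults(defaults=CONSOLE_DEFAULTS):
    """Write the given values under HKEY_CURRENT_USER\\Console (Windows only)."""
    import winreg  # standard library, but only importable on Windows

    with winreg.CreateKey(winreg.HKEY_CURRENT_USER, "Console") as key:
        for name, (type_name, value) in defaults.items():
            # Look up winreg.REG_DWORD / winreg.REG_SZ by name.
            winreg.SetValueEx(key, name, 0, getattr(winreg, type_name), value)


if __name__ == "__main__" and sys.platform == "win32":
    apply_console_defaults()
```

Because it writes only under HKEY_CURRENT_USER, this part does not need an elevated prompt; the TrueTypeFont entry under HKLM still does.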
Alon Or
  • 1
    Minus 1: UTF-8 is only partially supported in console windows, and only for output. – Cheers and hth. - Alf Nov 04 '17 at 21:10
  • 4
    This was a straight answer to the OP's question, "How to make Unicode charset in cmd.exe by default?", with all the steps as clear as possible. As already explained by Alastair McCormack: "There is no such thing as Unicode charset in cmd.exe. chcp 65001 provides some UTF-8 decoding but it's very rudimentary and doesn't provide proper input." So I don't see the need to explain the pros and cons of the given answer. – Alon Or Nov 06 '17 at 12:17
  • Actually, the output of the first `REG query` shows me only squares on rows 3 and 4 – realtebo Feb 28 '18 at 07:27
  • That's because CMD cannot output those characters; add UTF-8 support and you should be able to see them. – Alon Or Mar 02 '18 at 06:37
  • Thank you very much, this fixed the strange symbols that started appearing in CMD when I tried to use some CLI commands – mohagali Mar 23 '22 at 08:19
10

Save the following into a file with a ".reg" suffix:

    Windows Registry Editor Version 5.00

    [HKEY_CURRENT_USER\Console\%SystemRoot%_system32_cmd.exe]
    "CodePage"=dword:0000fde9

Double click this file, and regedit will import it.

It basically sets the CodePage value under the key HKEY_CURRENT_USER\Console\%SystemRoot%_system32_cmd.exe to 0xfde9 (65001 in decimal).

Shaohua Li
0

For me, with Visual Studio 2022, it worked when I imported this ".reg" file:

    Windows Registry Editor Version 5.00

    [HKEY_CURRENT_USER\Console\C:_Program Files_Microsoft Visual Studio_2022_Professional_Common7_IDE_CommonExtensions_Platform_Debugger_VsDebugConsole.exe]
    "CodePage"=dword:0000fde9

It is based on @Shaohua Li's answer: https://stackoverflow.com/a/24711864/2941313. It does the same thing but for a different path (specifically, the VS2022 debug console).
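
Judging from the two key names in this thread, the per-program subkey appears to be the executable's path with each backslash replaced by an underscore. The Python sketch below applies that substitution; the helper name `console_subkey_name` is hypothetical, and the executable path is inferred by reversing the substitution on the key above, so adjust it to your own edition and install location.

```python
def console_subkey_name(exe_path):
    """Derive the HKCU\\Console subkey name for a console executable path.

    Assumption (inferred from the keys shown in this thread): every
    backslash in the path is replaced by an underscore.
    """
    return exe_path.replace("\\", "_")


# Path inferred from the registry key in this answer; yours may differ.
vs_console = (
    r"C:\Program Files\Microsoft Visual Studio\2022\Professional"
    r"\Common7\IDE\CommonExtensions\Platform\Debugger\VsDebugConsole.exe"
)

key = "HKEY_CURRENT_USER\\Console\\" + console_subkey_name(vs_console)
print(key)
```

The same helper gives `%SystemRoot%_system32_cmd.exe` for `%SystemRoot%\system32\cmd.exe`, which matches the key in the answer this one is based on.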

Lech Osiński