I am working through the exercises in the book "Learn Python The Hard Way". One exercise requires me to set up my computer to work with UTF-8 files, but I have not been able to configure this.

I have read various blogs and answers here, but I can't figure out how to do this.

The image below shows how it should look (left) and what I am getting (right).

Thank you for the help!

Built13
  • What does it mean to set your computer to work with utf-8 files? – Stop harming Monica Apr 22 '18 at 16:56
  • I apologize, I meant to write Unicode. Thank you for pointing that out. I would highly appreciate any help with this, Goyo. – Built13 Apr 22 '18 at 17:51
  • The console should not be set to UTF-8 via chcp. This sets the input codepage to UTF-8, which is broken in all versions of Windows since it limits legacy console input (i.e. via `ReadFile` and `ReadConsoleA`) to 7-bit ASCII. Prior to Windows 10 it causes input with even a single non-ASCII character to look like EOF (i.e. a read of 0 characters), and in Windows 10 the non-ASCII characters are replaced with NUL characters. Setting the output codepage to 65001 works ok in Windows 8 and 10 for legacy output (`WriteFile` and `WriteConsoleA`), but it's also broken in Windows 7 with non-ASCII text. – Eryk Sun Apr 23 '18 at 02:04
  • If you need Unicode output to the console in Python 2, install and enable the win_unicode_console package. For Python 3, use at least version 3.6, which has been updated to use the console's wide-character API instead. Note that none of this has anything to do with the default encoding used for files and pipes in Python 3. It defaults to the system locale's ANSI encoding (e.g. codepage 1252 in Western locales). For standard I/O redirected to files and pipes (i.e. not to the console), set the `PYTHONIOENCODING` environment variable to override the default. – Eryk Sun Apr 23 '18 at 02:05
  • See this thread. Set it in `Region->Language` and use `utf-8` system wide. https://stackoverflow.com/a/15858812/7720976 – W.Perrin May 22 '19 at 02:54
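
The comment above about `PYTHONIOENCODING` can be checked with a minimal Python 3 sketch. It starts a child interpreter whose standard output is a pipe (not the console) and returns the raw bytes the child wrote. The helper name `run_child_with_encoding` is hypothetical and only for illustration; it assumes Python 3.5 or later for `subprocess.run`.

```python
import os
import subprocess
import sys


def run_child_with_encoding(text, encoding="utf-8"):
    """Print `text` from a child Python whose stdout is a pipe; return the raw bytes.

    Hypothetical helper: it only shows how PYTHONIOENCODING affects
    standard I/O that is redirected to a pipe or file, not console output.
    """
    # Copy the current environment and override the standard I/O encoding.
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    # ascii() yields an escaped literal, so the command line stays pure ASCII.
    code = "import sys; sys.stdout.write(%s)" % ascii(text)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Calling `run_child_with_encoding("caf\u00e9", "utf-8")` and then again with `"latin-1"` should return different byte sequences for the same text, which is the override the comment describes.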

1 Answer

You got the console codepage setting right. Your tutorial also has you setting PowerShell's $OutputEncoding. You can set it to match the console. You can even have this done each time you open a PowerShell console.

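    # Append both settings to the PowerShell profile so they run in every new session.
    # Single quotes keep $null and $OutputEncoding literal instead of expanding them now.
    'chcp 65001 1>$null' | Out-File -Encoding UTF8 -Append -PSPath (Get-Item variable:PROFILE).Value
    '$OutputEncoding = [Console]::OutputEncoding' | Out-File -Encoding UTF8 -Append -PSPath (Get-Item variable:PROFILE).Value
    Write-Warning "Reopen PowerShell console"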

Note: You also need to configure the console with a font that supports the characters that you'll use. You'll have to experiment with this. (There are fonts that support all defined codepoints but they do so by showing only the codepoint number for many codepoints.)
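
One possible way to run the font experiment from Python 3 is sketched below. It builds one line per character with its code point and Unicode name, so you can see in the console which glyphs the chosen font actually draws. The function name `describe_characters` is hypothetical, and printing the lines assumes the console encoding has already been set up as described above.

```python
import unicodedata


def describe_characters(text):
    """Return one 'U+XXXX <char> <NAME>' line per character in `text`.

    Hypothetical helper for eyeballing font coverage: if the character
    column shows a box or question mark, the font lacks that glyph.
    """
    lines = []
    for ch in text:
        # Fall back to a placeholder for code points without an official name.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append("U+{:04X} {} {}".format(ord(ch), ch, name))
    return lines
```

For example, `print("\n".join(describe_characters("\u00e9\u2192")))` prints two lines; compare the glyph in each line with the name next to it.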

Tom Blodget