I've got a simple Python script, foo.py, that prints its stdout encoding and then a single Unicode character (U+2588, a full block):

import sys
print(sys.stdout.encoding)
print(b'\xe2\x96\x88'.decode('utf8'))

I want to run it in PowerShell and pipe the output to Write-Host:

PS> c:\python37\python.exe foo.py | Write-Host

If I do this, the result is:

Traceback (most recent call last):
  File ".\pyen.py", line 3, in <module>
    print(b'\xe2\x96\x88'.decode('utf8'))
  File "C:\python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2588' in position 0: character maps to <undefined>
cp1252

It turns out this isn't even a Write-Host problem. Just assigning the output to a variable, or piping it to Out-Null, gives the same error. Only when the output goes straight to the console does the script succeed:

PS> c:\python37\python.exe foo.py | Out-Null   # Same error
PS> $a = c:\python37\python.exe foo.py         # Same error
PS> c:\python37\python.exe foo.py              # No error, stdout encoding is printed as utf-8

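I've gone down the rabbit hole of why this is happening. PowerShell picks the default Windows code page (cp1252) for many things, and as the output above shows, Python reports cp1252 as its stdout encoding whenever its output is piped or captured rather than written directly to the console.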
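The failure can be reproduced without PowerShell. The following is a minimal sketch, not part of the original script; the helper name `can_encode` is hypothetical. It only shows that the character from foo.py has no mapping in cp1252 but encodes fine as UTF-8, which is consistent with the traceback above.

```python
import sys

# The same character foo.py prints: U+2588 FULL BLOCK.
BLOCK = b'\xe2\x96\x88'.decode('utf8')


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding` (hypothetical helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(BLOCK, 'cp1252'))  # False: cp1252 has no mapping for U+2588
print(can_encode(BLOCK, 'utf-8'))   # True

# Diagnostic on stderr: whether stdout is a console, and which encoding
# Python chose for it. The result depends on how the script is launched.
print(sys.stdout.isatty(), sys.stdout.encoding, file=sys.stderr)
```

Running this both directly and through a pipe should show whether the encoding Python selects changes with redirection on a given machine.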

This answer offers a couple of solutions: "Using UTF-8 Encoding (CHCP 65001) in Command Prompt / Windows Powershell (Windows 10)"

Unfortunately, changing my $PROFILE to set the input and output encoding doesn't help.

The more permanent solution in that answer, enabling UTF-8 system-wide, does fix this, but it is a beta feature and can break other things, so I'd rather not go down that road.

I've also played with setting the Python environment variable for encoding and with modifying the Python source, but neither is a great answer, since both mean tweaking the environment or the code of every Python script whose output I want to pipe to Write-Host.
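For reference, one way the "modify the Python source" option might look is sketched below. This is an assumption about what such a change could be, not the code actually tried. It relies on `TextIOWrapper.reconfigure`, which exists in Python 3.7 and later; in a real script the call would typically be `sys.stdout.reconfigure(encoding='utf-8')` near the top. An in-memory stream stands in for a piped stdout here so the effect is visible.

```python
import io

# Stand-in for a piped stdout that Python opened as cp1252.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding='cp1252', newline='\n')

# Switch the text layer to UTF-8 before writing anything.
stream.reconfigure(encoding='utf-8')

stream.write('\u2588\n')
stream.flush()

# The bytes that would travel down the pipe.
print(raw.getvalue())  # b'\xe2\x96\x88\n'
```

The drawback noted above still applies: every script whose output is piped would need this change.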

Any ideas?

aggieNick02
  • I'm not seeing the point of Write-Host in your example; is there a reason you want to use it? – Persistent13 Oct 22 '19 at 22:27
  • Yeah. All of this is actually running in a PowerShell script that uses `Start-Transcript` and `Stop-Transcript` to keep a log of what happened. The Python code is actually invoked by an exe that is run from the PowerShell script, and without the Write-Host, the output appears in the console but is not logged to the transcript. – aggieNick02 Oct 23 '19 at 17:59
  • I updated the question, as this occurs in several other scenarios too: doing almost anything with the output causes the same issue. – aggieNick02 Oct 23 '19 at 18:16
  • Check `[System.Console]::OutputEncoding`; if necessary, you can set it with `[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8` in the PowerShell session. – JosefZ Apr 02 '22 at 16:09
  • Check `[System.Console]::InputEncoding` as well. – JosefZ Apr 02 '22 at 16:17
  • @JosefZ Unfortunately the behavior does not change even after setting both `[System.Console]::OutputEncoding` and `[System.Console]::InputEncoding` to `[System.Text.Encoding]::UTF8` – aggieNick02 Apr 04 '22 at 20:40
  • Check `$env:PYTHONIOENCODING`; mine is `utf-8`, and no problem with running your code… – JosefZ Apr 05 '22 at 09:51
  • @JosefZ Indeed, that alone is enough to fix it. In my original question I mentioned being reluctant to set the Python environment variable for encoding, but I'm not sure why... maybe I thought I'd have to set it to cp1252, since that is what Windows often uses, and worried about the effects of that. It's been a couple of years, so I honestly don't remember. Regardless, thanks for the comments and the help; using `$env:PYTHONIOENCODING` seems like a good option. – aggieNick02 Apr 07 '22 at 18:49
  • Indeed, I went and looked at my commit history from around this time, and setting that environment variable was what I settled on. I set it, call my Python script, and then restore it, I guess to be a good citizen in case some other Python script has issues when it is set. – aggieNick02 Apr 07 '22 at 18:55
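The set, call, restore approach from the last comments was done in PowerShell. As a rough Python analogue (a sketch only; the function name `run_with_utf8_stdout` and the inline child script are hypothetical), the variable can be scoped to a single child process by passing a modified copy of the environment, so nothing needs restoring afterwards:

```python
import os
import subprocess
import sys

# Child script equivalent to the last line of foo.py, passed via -c.
CHILD_CODE = r"print(b'\xe2\x96\x88'.decode('utf8'))"


def run_with_utf8_stdout(args):
    """Run a command with PYTHONIOENCODING=utf-8 set for that child only."""
    # Copy the current environment and add the override; os.environ itself
    # is left untouched.
    env = dict(os.environ, PYTHONIOENCODING='utf-8')
    result = subprocess.run(args, env=env, stdout=subprocess.PIPE, check=True)
    # The child wrote UTF-8 bytes to the pipe, so decode them as UTF-8.
    return result.stdout.decode('utf-8')


output = run_with_utf8_stdout([sys.executable, '-c', CHILD_CODE])
print(ascii(output.strip()))  # '\u2588'
```

Because the child's stdout is a pipe here, this exercises the same situation as piping to Write-Host or assigning to a variable.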

0 Answers