I'm writing a small Python application that runs a PowerShell script. The Python script should process the returned string, but I'm having trouble with the encoding of special characters such as 'ä', 'ö', and 'Ü'. How can I get back a Unicode/UTF-8 string?

A simple example is below. The console output is b'\xc7\xcf\r\n'. I don't understand why it isn't b'\xc3\xa4\r\n', since \xc3\xa4 is the UTF-8 encoding of the character 'ä'.

try.py:

import subprocess
p = subprocess.check_output(["powershell.exe", r".\script.ps1"])
print(p)

script.ps1:

return 'ä'

I changed my PowerShell script in a few ways but did not get the desired result.

  1. Added [Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8 to the script. Result: b'\xc3\x83\xc2\xa4\r\n'

  2. Returned [System.Text.Encoding]::UTF8.GetBytes("ä") instead of the string. Result: b'195\r\n131\r\n194\r\n164\r\n'
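
The two results above are consistent with each other: 195, 131, 194, 164 are the decimal values of \xc3\x83\xc2\xa4, and those four bytes are what you get when the UTF-8 bytes of 'ä' are misread as Windows-1252 and then encoded as UTF-8 again. The sketch below only checks that arithmetic in Python. That the misreading happens when PowerShell loads a script file saved as UTF-8 without a BOM is an assumption, not something these bytes prove.

```python
# Bytes observed in attempt 1 and the decimal values printed in attempt 2.
observed = b"\xc3\x83\xc2\xa4"
decimal_values = [195, 131, 194, 164]

# Both attempts produced the same four bytes.
same_bytes = bytes(decimal_values) == observed

# Simulate the suspected mix-up: the UTF-8 bytes of 'ä' are read as
# Windows-1252 and the resulting text is written out again as UTF-8.
misread = "ä".encode("utf-8").decode("cp1252")
double_encoded = misread.encode("utf-8")

# Undo the mix-up to get the intended character back.
repaired = observed.decode("utf-8").encode("cp1252").decode("utf-8")

print(same_bytes, double_encoded == observed, repaired == "ä")
```

If all three checks hold, the data left Python-side intact and the damage happened before the bytes were written, which points at how the script file or the literal was read.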

How can I get 'ä' as the console output for the script above?

arajshree
PSaR

1 Answer

I used "pwsh" because I ran it on a Mac; you can use "powershell.exe" in your code. Try this:

import subprocess

p = subprocess.check_output(["pwsh", r".\sc.ps1"])
print(p.decode('utf-8'))
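
Decoding as UTF-8 only works if PowerShell actually writes UTF-8. One way to arrange that, sketched here under assumptions, is to set [Console]::OutputEncoding in the same command that invokes the script and let subprocess do the decoding. build_command and run_script are hypothetical helper names, and for Windows PowerShell the script file itself may additionally need to be saved as UTF-8 with a BOM so that the 'ä' literal is read correctly.

```python
import subprocess


def build_command(script_path, exe="powershell.exe"):
    """Build an argument list that switches console output to UTF-8
    before running the script (hypothetical helper)."""
    set_utf8 = "[Console]::OutputEncoding = [System.Text.Encoding]::UTF8"
    # -Command runs both statements in one PowerShell session.
    return [exe, "-NoProfile", "-Command", f"{set_utf8}; & '{script_path}'"]


def run_script(script_path, exe="powershell.exe"):
    """Run the script and return its output as str, decoded as UTF-8."""
    return subprocess.check_output(build_command(script_path, exe), encoding="utf-8")


# Only builds the argument list; nothing is executed here.
cmd = build_command(r".\script.ps1")
```

On a machine with PowerShell installed, run_script(r".\script.ps1") would return the decoded text; pass exe="pwsh" to use PowerShell Core instead.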

For more: You can read here.


Nitin
  • I installed pwsh, tested with both pwsh and powershell, and added print(p.decode('utf-8')) to my try.py script. pwsh returns b'\x84\r\n' and powershell returns b'\xc7\xcf\r\n'. With pwsh, the decode call fails with: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x84 in position 0: invalid start byte – PSaR Jun 27 '19 at 10:39
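
The byte \x84 reported in the comment above is 'ä' in the OEM code pages 437 and 850, which suggests pwsh wrote its output in the console's OEM code page rather than in UTF-8. A minimal sketch of decoding with a fallback follows. The helper name decode_output and the choice of cp850 as the fallback are assumptions for illustration; the right code page depends on the machine (on Windows, chcp shows the active one).

```python
def decode_output(raw, fallback="cp850"):
    """Decode subprocess output as UTF-8, falling back to an OEM code page.

    `fallback` is an assumed default; use the code page of your console.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes such as 0x84 are not valid UTF-8, so try the OEM code page.
        return raw.decode(fallback)


# The bytes reported for pwsh decode cleanly with the OEM code page.
text = decode_output(b"\x84\r\n")
# Valid UTF-8 input is handled by the first branch.
utf8_text = decode_output(b"\xc3\xa4\r\n")
```

A fallback like this is a workaround rather than a fix: it can misread output that is valid UTF-8 by coincidence, so forcing a known encoding on the PowerShell side is the more predictable option.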