Context

I have a generator that yields every line of output from a given command (see the code snippet below; code taken from here).

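import subprocess

def execute(cmd):
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True, universal_newlines=True)

    # Text mode: readline() returns "" at end of output, which stops the iterator.
    for stdoutLine in iter(popen.stdout.readline, ""):
        # rstrip() takes a set of characters; '\r|\n' would also strip a trailing '|'.
        yield stdoutLine.rstrip('\r\n')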

The issue

The problem is that a stdout line can contain bytes that cp1252 cannot decode (see the error messages below, each from a different test run).

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 6210: character maps to <undefined>
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 3691: character maps to <undefined>
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 6228: character maps to <undefined>
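The error can be reproduced without running any command. The sketch below assumes only that the byte 0x8d reaches a cp1252 decoder; the helper name `decode_or_none` is hypothetical and used purely for illustration.

```python
def decode_or_none(raw, encoding="cp1252"):
    """Return the decoded text, or None if raw is not valid in the given encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# 0x8d is one of the few byte values that cp1252 leaves unassigned,
# so decoding it fails just like in the tracebacks above.
print(decode_or_none(b"\x8d"))  # None
print(decode_or_none(b"abc"))   # abc
```

So the failure comes from the decoding step that `universal_newlines=True` turns on, not from the command itself.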

Question

What should I do to handle these special characters?

Programer Beginner

1 Answer

The solution is quite simple: don't decode the stdout if it's not necessary.

My solution was to add a parameter to the execute function that determines whether the generator yields decoded strings or untouched bytes.

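import subprocess

def execute(cmd, decode=False):
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True, universal_newlines=decode)

    # At end of output readline() returns "" in text mode but b"" in binary mode,
    # so the sentinel and the characters to strip must match the mode.
    # (A "" sentinel in binary mode would never match and the loop would not end.)
    sentinel, newline = ("", "\r\n") if decode else (b"", b"\r\n")
    for stdoutLine in iter(popen.stdout.readline, sentinel):
        yield stdoutLine.rstrip(newline)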

So when I know that the command I am executing will only output ASCII characters and I need a decoded string, I pass decode=True.
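If decoded text is needed even when the output may contain undecodable bytes, one alternative is to let `subprocess.Popen` decode with an explicit `encoding` and `errors` policy (both parameters exist since Python 3.6). This is a minimal sketch, not the approach used above: the name `execute_text`, the default of UTF-8, and `DEMO_CMD` are assumptions made for illustration.

```python
import subprocess
import sys


def execute_text(cmd, encoding="utf-8", errors="replace"):
    """Yield decoded lines from cmd; undecodable bytes become U+FFFD."""
    popen = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        shell=True,
        encoding=encoding,  # passing encoding/errors switches the pipe to text mode
        errors=errors,      # "replace" substitutes U+FFFD instead of raising
    )
    for line in iter(popen.stdout.readline, ""):
        yield line.rstrip("\r\n")
    popen.stdout.close()
    popen.wait()


# Hypothetical demo command: a child Python process that writes "a", the byte 0x8d
# and a newline to its stdout.
DEMO_CMD = f'"{sys.executable}" -c "import sys; sys.stdout.buffer.write(bytes([97, 141, 10]))"'
```

With this sketch, `list(execute_text(DEMO_CMD, encoding="cp1252"))` should give a single line in which the 0x8d byte has been replaced by U+FFFD. The trade-off is that the original bytes are lost, so yielding raw bytes remains the safer choice when the output has to be preserved exactly.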

Programer Beginner