
I'm running into what I assume is some strange encoding error, and it's really baffling me. I'm trying to write a unicode string to a file as an image, and the string itself prints fine:

ìԉcïԁiԁúлt cúɭpâ ρáncéttá, ëɑ ëɭìt haϻ offícìà còлѕêɋûät. Sunt ԁësërúлt

but whenever I try to write the string out anywhere else, I get the standard ascii encoding error:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-3: ordinal not in range(128)

I've tried setting the encoding of my source files, and ensuring that my system variable isn't set to ascii, and I've tried directly outputting to a file via:

python script.py > output.jpg

and none of it seems to have any effect. I feel a little silly for not being able to solve a simple encoding issue, but I've really got no clue as to where the ascii codec is even coming from at this point.

Relevant code:

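import StringIO

import numpy
from PIL import Image

def random_image(**kwargs):
    # Random RGB noise with the requested dimensions
    image_array = numpy.random.rand(kwargs["dims"][0], kwargs["dims"][1], 3)*255
    image = Image.fromarray(image_array.astype('uint8')).convert('RGBA')
    format = kwargs.get("format", "JPEG")
    # Encode the image into an in-memory buffer
    output = StringIO.StringIO()
    image.save(output, format=format)
    content = output.getvalue()
    output.close()
    # Replace each byte with its decimal ordinal, as a string
    content = [str(ord(char)) for char in content]
    return content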
Slater Victoroff

1 Answer


The first question is why you are storing the contents of your image as a Unicode string at all. Images typically contain arbitrary octets and should be represented with str (bytes in Python 3), not with the unicode type.
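As a rough illustration of keeping image data as raw octets, here is a minimal sketch. The helper name save_bytes, the temporary path, and the six-byte payload are all made up for the example; the payload merely stands in for whatever output.getvalue() returns in the question. The point is only that a file opened in binary mode takes the bytes as they are, so no codec is ever involved.

```python
import os
import tempfile


def save_bytes(payload, path):
    """Write raw octets to path; binary mode means no text encoding step."""
    with open(path, "wb") as handle:
        handle.write(payload)


# Hypothetical stand-in for encoded image data (not a complete JPEG).
payload = bytes(bytearray([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]))

path = os.path.join(tempfile.mkdtemp(), "output.jpg")
save_bytes(payload, path)

# Read the file back in binary mode to confirm nothing was altered.
with open(path, "rb") as handle:
    round_trip = handle.read()
```

Writing the file from inside the script this way also avoids depending on how stdout is encoded when it is redirected.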

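When you print a Unicode string to the screen, the encoding is chosen based on the environment settings. When the output goes to a file instead, including a shell redirect such as python script.py > output.jpg, Python 2 has no terminal to take an encoding from, so sys.stdout.encoding is None and ascii is assumed unless you specify an encoding yourself. To have your program default to something more sane for files, start it with: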

import codecs
import sys
encoding = sys.stdout.encoding or 'utf-8'
sys.stdout = codecs.getwriter(encoding)(sys.stdout, errors='replace')
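
For text that really is text, another option is to name the encoding at the point where the file is opened rather than relying on stdout. The following is only a sketch under assumptions: the sample string, the file name, and the choice of utf-8 are illustrative and do not come from the question. It first shows the ascii codec rejecting the string, then writes the same string with an explicit encoding.

```python
import io
import os
import tempfile

# Illustrative non-ASCII text, written with escapes so the source stays ASCII.
text = u"\u00ec\u0509c\u00ef\u0501"

# Encoding with ascii fails in the same way as the error in the question.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Naming the encoding when opening the file avoids the ascii default.
path = os.path.join(tempfile.mkdtemp(), "output.txt")
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(text)

# Read the raw bytes back to see what was actually stored on disk.
with io.open(path, "rb") as handle:
    raw = handle.read()
```

io.open is used because it accepts an encoding argument on both Python 2.6+ and Python 3.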
user4815162342