4

I am getting this error message:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

From this code:

from PIL import Image
import os

image_name = "mypic"
count = 0

for filename in os.listdir('pictures'):
    if filename.endswith('.jpg'):
        image_file = open('pictures/' +filename)
        image = Image.open(image_file)

        # next 3 lines strip exif
        data = list(image.getdata())
        image_without_exif = Image.new(image.mode, image.size)
        image_without_exif.putdata(data)

        image_without_exif.save('new-pictures/' + image_name + str(count) + '.jpg')
        count = count + 1;

Not sure why, as this was working yesterday...

martineau
fusilli.jerry89

3 Answers

3

I think you need to open the file in binary mode:

image_file = open('pictures/' +filename, 'rb')
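
To see the difference for yourself, here is a minimal sketch that needs no real image. It writes the first four bytes of a typical JPEG to a temporary file and reads them back in both modes. The file name `fake.jpg` and the helper names are made up for illustration, and the encoding is set to UTF-8 explicitly to match the error in the question (the default text encoding can differ by platform).

```python
import os
import tempfile

# JPEG files begin with the bytes FF D8; 0xFF is never valid in UTF-8.
JPEG_HEADER = b"\xff\xd8\xff\xe0"


def read_text_mode(path):
    """Try to read the file as UTF-8 text; return the error message, or None."""
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


def read_binary_mode(path):
    """Read the file as raw bytes; no decoding is attempted."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fake.jpg")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(JPEG_HEADER)

    text_error = read_text_mode(path)  # decoding fails on the first byte
    raw = read_binary_mode(path)       # the same bytes come back untouched

print(text_error)
print(raw)
```

Text mode fails on the very first byte, while binary mode hands back exactly what was written.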
Sam Comber
  • This will work too, and is likely a better route if directly manipulating the file is necessary (for example [steganography](https://en.wikipedia.org/wiki/Steganography)). Similarly, PIL.Image.open will accept any file-like object (such as a `BytesIO`), which _"must implement read(), seek(), and tell() methods, and be opened in binary mode"_ ([docs](https://pillow.readthedocs.io/en/latest/reference/Image.html#PIL.Image.open)) – ti7 Jul 08 '20 at 16:35
3

This happens because open() is trying to read the file as text. You can resolve this by passing the path directly to Image.open():

img = Image.open('pictures/' + filename)

This works because PIL opens the file in binary mode for you internally; see its documentation for more:
https://pillow.readthedocs.io/en/latest/reference/Image.html#PIL.Image.open

Further, it probably makes even more sense to use Image.open() as a context manager, which closes the image file for you when you are done (there's a good explanation here):

with Image.open('pictures/' + filename) as img:
    # process img
# image file closed now after leaving context scope
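
Putting it together, the loop from the question could look roughly like this. It is only a sketch, assuming Pillow is installed; the function name `strip_exif` and its parameters are my own invention, and the three EXIF-stripping lines are taken unchanged from the question.

```python
import os
from PIL import Image


def strip_exif(src_dir, dst_dir, base_name="mypic"):
    """Re-save each .jpg in src_dir without metadata; return the new paths."""
    os.makedirs(dst_dir, exist_ok=True)
    written = []
    jpgs = sorted(f for f in os.listdir(src_dir) if f.endswith(".jpg"))
    for count, filename in enumerate(jpgs):
        # Pass the path straight to Pillow; it opens the file in binary mode.
        with Image.open(os.path.join(src_dir, filename)) as image:
            # Copy only the pixel data into a fresh image (as in the question).
            data = list(image.getdata())
            image_without_exif = Image.new(image.mode, image.size)
            image_without_exif.putdata(data)
        # The source file is closed here; the copy lives in memory.
        out_path = os.path.join(dst_dir, f"{base_name}{count}.jpg")
        image_without_exif.save(out_path)
        written.append(out_path)
    return written
```

Calling `strip_exif('pictures', 'new-pictures')` would then mirror the original script, with `enumerate` replacing the manual counter.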
ti7
0

When using the open(filename) function without any further arguments, you open the file in "text" mode.

Python will assume that the file contains text and try to decode it (as UTF-8 here, judging by the error message). A JPEG file begins with the byte 255 (0xFF), which is never valid in UTF-8, so decoding fails on the very first byte (position 0).

To fix this, open the file in binary mode:

open(filename, "rb")

This tells Python not to decode the contents as text; the file handle will just give out the raw bytes instead.

Because this is a common use-case, PIL already has opening images by filename built in:

Image.open(filename)
Azsgy