I have 30 000 images to check for size, format and some other things.

I've checked all of them except 200 images. These 200 images raise an error in Pillow:

from PIL import Image
import requests

url = 'https://img.yakaboo.ua/media/wysiwyg/ePidtrymka_desktop.svg'
image = Image.open(requests.get(url, stream=True).raw)

This gives an error:

PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7fbfbf59c810>

Here are some other images that give the same error:

https://www.yakaboo.ua/ua/skin/frontend/bootstrap/yakaboo/images/logo/y-logo.png
https://img.yakaboo.ua/media/wysiwyg/ePidtrymka_desktop.svg
https://img.yakaboo.ua/media/wysiwyg/ePidtrymka_desktop_futer.svg
https://www.yakaboo.ua/ua/skin/frontend/bootstrap/yakaboo/images/icons/googleplay.png
https://www.yakaboo.ua/ua/skin/frontend/bootstrap/yakaboo/images/icons/appstore.png

If I download these images, everything works fine. But I need to check them without downloading. Is there any solution?

djangodjames
  • See: https://stackoverflow.com/questions/15130670/pil-and-vectorbased-graphics It looks like Pillow does not support SVG. – MSH May 31 '22 at 13:52

1 Answer

  1. You're not checking for any errors you might get from requests responses, so chances are you might be trying to identify e.g. an error page.
  2. Pillow doesn't support SVG files (and they don't necessarily have an intrinsic size anyway). You'll need something else to identify them.
  3. You're explicitly asking requests for the raw stream (`stream=True` and `.raw`), which is the body as it came over the wire, before any `Content-Encoding` has been decoded. For that y-logo.png, the server sends a response with `Content-Encoding: gzip`, so Pillow is handed gzip-compressed bytes rather than PNG data; no wonder you're having a hard time. You might want to drop `stream=True` and `.raw` and instead read the response into memory, wrap it as `io.BytesIO(resp.content)` and pass that to Pillow. If that's not an option, you could also write a file-like wrapper around a requests response, but it's likely not worth the effort.
  4. To save a bunch of time (by reusing connections), use a Requests session.
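Points 1, 3 and 4 could be combined roughly as follows. This is only a minimal sketch: the helper names `describe_image`, `check_url` and `check_urls`, the 10-second timeout and the shape of the returned dictionaries are all illustrative assumptions, not anything from the question.

```python
import io

import requests
from PIL import Image, UnidentifiedImageError


def describe_image(data: bytes) -> dict:
    """Return format and pixel size for already-decoded image bytes."""
    # Pillow only needs a seekable file-like object, so no file on disk is required.
    with Image.open(io.BytesIO(data)) as img:
        return {"format": img.format, "width": img.width, "height": img.height}


def check_url(session, url: str) -> dict:
    """Fetch one URL and describe it (hypothetical helper)."""
    resp = session.get(url, timeout=10)  # timeout value is an arbitrary assumption
    resp.raise_for_status()  # point 1: don't try to identify an error page
    try:
        # point 3: resp.content has Content-Encoding (e.g. gzip) already decoded
        return describe_image(resp.content)
    except UnidentifiedImageError:
        # e.g. SVG, which Pillow cannot open (point 2)
        return {"format": None, "width": None, "height": None}


def check_urls(urls) -> dict:
    """Check many URLs over one session so connections are reused (point 4)."""
    with requests.Session() as session:
        return {url: check_url(session, url) for url in urls}
```

Calling `check_urls([...])` with the URLs from the question would then return one entry per URL; the SVG ones would come back with `format` set to `None` and need separate handling.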
AKX
  • The Requests [docs](https://requests.readthedocs.io/en/master/user/quickstart/#binary-response-content) suggest #3 too – tevemadar May 31 '22 at 14:01
  • Yes, this works. But there is still a problem with .svg files. If Pillow doesn't support SVG, what other solutions can be used? – djangodjames May 31 '22 at 14:05
  • @djangodjames What information do you need from the files? SVG files are infinitely scalable, so you can't really just print out a "pixel size" for them. – AKX May 31 '22 at 14:09
  • I need resolution and size – djangodjames May 31 '22 at 14:14
  • SVG is a vector format. It has no resolution and image size. – Matthias May 31 '22 at 14:16
  • If I download any .svg image to my PC and look at its Image Properties, it shows both a resolution and a size – djangodjames May 31 '22 at 14:18
  • Well, your PC is lying to you, or guessing some values. You could consider the `width` and `height` (or the `viewBox`) of an SVG to be the size, but since the units for those are entirely arbitrary, they're not of any use. An SVG with a "size" of 1000x1000 could be any size in "real life". – AKX May 31 '22 at 14:21
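
If the declared `width`, `height` and `viewBox` values are still wanted despite that caveat, they can be read straight off the root element with the standard library. A minimal sketch, assuming the SVG bytes have already been fetched; the function name `svg_declared_size` is hypothetical, and the values are returned as raw strings because their units are arbitrary:

```python
import xml.etree.ElementTree as ET


def svg_declared_size(svg_bytes: bytes) -> dict:
    """Return the raw width/height/viewBox attributes of an SVG root element."""
    root = ET.fromstring(svg_bytes)
    # The tag is usually namespaced, e.g. "{http://www.w3.org/2000/svg}svg".
    if root.tag.rsplit("}", 1)[-1] != "svg":
        raise ValueError("root element is not <svg>")
    # Any of these may be missing (None); none of them is a pixel size.
    return {
        "width": root.get("width"),
        "height": root.get("height"),
        "viewBox": root.get("viewBox"),
    }
```

For files from untrusted sources, a hardened XML parser may be a safer choice than plain `xml.etree.ElementTree`.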