I am using Python requests to fetch an image file from an image URL.

The code below works in most cases, but it is failing for more and more URLs.

import cStringIO
import requests
from PIL import Image

image_url = "<url_here>"
# Browser-like request headers.
headers = {'User-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36', 'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8','Accept-Encoding':'gzip,deflate,sdch'}
r = requests.get(image_url, headers=headers)
# Wrap the raw response bytes in a file-like object so PIL can parse them.
image = Image.open(cStringIO.StringIO(r.content))

If that raises an error, I retry with a different Accept header (this has solved issues in the past):

headers = {'User-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36', 'Accept':'image/webp,*/*;q=0.8','Accept-Encoding':'gzip,deflate,sdch'}

However, these URLs (among others) don't work. They fail with "IOError: cannot identify image file":

http://www.paleoeffect.com/wp-content/uploads/2011/06/800x414xpaleo_bread_wheat_recipe-800x414.jpg.pagespeed.ic.6pprrYPoTo.webp

http://cdn.casaveneracion.com/vegetarian/2013/08/vegan-spaghetti1.jpg

http://www.rachaelray.com/site/images/sidebar-heading-more-recipes-2.svg

My browser displays the images fine from these URLs. I don't know whether all three share the same issue.

user984003

1 Answer

You are using the Python Imaging Library (PIL) to provide the Image class mentioned in the last line of your code.

  • The Paleo Effect image is a WebP file. PIL does not support WebP.
  • The Casa Veneracion URL does not link to an image file. It returns a 302 redirect to an HTML page. (See for yourself.)
  • The Rachael Ray image is an SVG file. PIL does not support SVG.

See the bottom of this documentation for the image formats supported by PIL.
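The three diagnoses above can be checked before the bytes ever reach PIL. The following is only a rough sketch: `sniff_format` and `diagnose` are hypothetical helper names, the signature checks cover just a handful of common formats, and `diagnose` assumes an object shaped like a requests response (with `history`, `headers` and `content` attributes).

```python
def sniff_format(data):
    """Guess a payload's format from its first bytes (not exhaustive)."""
    head = data[:1024]
    if head.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if head[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    # Text-based payloads: ignore leading whitespace and letter case.
    text = head.lstrip().lower()
    if text.startswith(b"<!doctype html") or text.startswith(b"<html"):
        return "html"
    if b"<svg" in text:
        return "svg"
    return "unknown"


def diagnose(response):
    """Summarise why a requests-style response may not open in PIL."""
    notes = []
    # requests records any followed redirects in response.history.
    if response.history:
        notes.append("redirected %d time(s)" % len(response.history))
    content_type = response.headers.get("Content-Type", "")
    if not content_type.startswith("image/"):
        notes.append("Content-Type is %r" % content_type)
    notes.append("payload looks like %s" % sniff_format(response.content))
    return notes
```

Calling `diagnose(r)` on the response from `requests.get(...)` before `Image.open(...)` should show whether the server sent a redirect, a non-image Content-Type, or a format such as WebP or SVG that needs a different library.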

Oddthinking
  • Hmm, what would be the solution? What works other than PIL? The Casa Veneracion URL does, btw, link to an image for me. (Your link is different). – user984003 Jul 14 '15 at 05:16
  • Use the link I provided to type in the URL for the vegan spaghetti jpg. See that it receives a redirect. – Oddthinking Jul 14 '15 at 06:43
  • How do you support webp? See http://stackoverflow.com/questions/6876502/manipulate-webp-images-in-python – Oddthinking Jul 14 '15 at 06:44
  • How do you support SVG? See http://stackoverflow.com/questions/3600164/read-svg-file-with-python-pil amongst others. – Oddthinking Jul 14 '15 at 06:46