I am using Python requests to download an image file from an image URL.
The code below works in most cases, but it is starting to fail for more and more URLs.
import cStringIO
import requests
from PIL import Image

image_url = "<url_here>"
headers = {'User-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36', 'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8','Accept-Encoding':'gzip,deflate,sdch'}
r = requests.get(image_url, headers=headers)
# Wrap the raw response bytes in a file-like object so PIL can read them
image = Image.open(cStringIO.StringIO(r.content))
If that raises an error, I retry with a different Accept header (this has solved the problem in the past):
headers = {'User-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.76 Safari/537.36', 'Accept':'image/webp,*/*;q=0.8','Accept-Encoding':'gzip,deflate,sdch'}
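The retry described above can be sketched as a small helper that walks through the header dictionaries in order. This is only an illustration of the approach, not a fix: `open_with_fallback`, `fetch`, and `parse` are hypothetical names, and the download and decode steps are passed in as callables so the loop itself stays independent of requests and PIL.

```python
def open_with_fallback(url, header_sets, fetch, parse):
    """Try each header dict in order; return the first result that parses.

    fetch(url, headers) -> bytes and parse(data) -> image are hypothetical
    callables supplied by the caller, not part of requests or PIL.
    """
    if not header_sets:
        raise ValueError("at least one header dict is required")
    last_error = None
    for headers in header_sets:
        try:
            return parse(fetch(url, headers))
        except IOError as exc:
            # IOError is what "cannot identify image file" is reported as;
            # remember it and move on to the next header dict
            last_error = exc
    # Every header dict failed, so surface the most recent error
    raise last_error
```

With the code in this question, `fetch` could be `lambda url, h: requests.get(url, headers=h).content` and `parse` could be `lambda data: Image.open(cStringIO.StringIO(data))`, with the two header dictionaries passed as `header_sets`.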
However, these URLs (among others) still don't work. They raise "IOError: cannot identify image file":
http://cdn.casaveneracion.com/vegetarian/2013/08/vegan-spaghetti1.jpg
http://www.rachaelray.com/site/images/sidebar-heading-more-recipes-2.svg
Both images display fine in my browser when I open the URLs directly. I don't know whether the two URLs fail for the same reason.
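One way to narrow this down might be to look at the first bytes of `r.content` and see what the server actually sent. The sketch below is a rough check based on well-known file signatures, not a full format detector; `sniff_payload` is a hypothetical helper name. An "html" result would suggest an error or block page came back instead of the image, and an "svg" result would point to a vector file, which PIL does not read.

```python
def sniff_payload(content):
    """Guess what kind of data a response body holds from its first bytes."""
    if content.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if content.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if content.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if content[:4] == b"RIFF" and content[8:12] == b"WEBP":
        return "webp"
    # Text formats: skip leading whitespace, then look for markup near the start
    head = content[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or b"<html" in head:
        return "html"
    if b"<svg" in head:
        return "svg"
    if not content:
        return "empty"
    return "unknown"
```

For example, calling `sniff_payload(r.content)` right after the `requests.get` call, together with printing `r.status_code` and `r.headers.get('content-type')`, shows whether the bytes handed to `Image.open` are an image at all.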