Read image from URL and keep it in memory

Question

I am using Python with the requests library. I just want to download an image into memory (into a NumPy array, for example). There are several existing questions that show different combinations of tools (OpenCV, PIL, requests, urllib...).

None of them work for my case. I basically receive this error when I try to download the image:

cannot identify image file <_io.BytesIO object at 0x7f6a9734da98>

A simplified version of my code:

import requests
from PIL import Image

response = requests.get(url, stream=True)
response.raw.decode_content = True
image = Image.open(response.raw)
image.show()

The main thing that is driving me crazy is that, if I download the image to a file (using urllib), the whole process runs without any problem!

import os
import urllib.request
urllib.request.urlretrieve(garment.url, os.path.join(download_folder, garment.get_path()))

What could I be doing wrong?

EDIT:

My mistake turned out to be in how the URL was built, not in requests or PIL. The code example above works as expected when the URL is correct.

This is a duplicate of https://stackoverflow.com/questions/23587426/pil-open-method-not-working-with-bytesio – Eolmar, May 04 '18 at 11:36
I think you might be wrong! Where am I supposed to call seek? In the question you mentioned they write the image to a file, but that is exactly what I am trying to avoid – m33n, May 04 '18 at 11:48
Your first code block works fine for me. What versions of Python/requests/PIL are you using? I used `Pillow==5.1.0 requests==2.18.4` on Python 2.7 – The Pjot, May 04 '18 at 11:49
I only get your error if the URL I'm opening is not actually an image. – The Pjot, May 04 '18 at 11:54
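
The last comment above suggests a quick diagnostic: look at what the server actually returned before handing it to PIL. Below is a minimal sketch, assuming the payload is already available as bytes. The helper names `sniff_image_type` and `explain_payload` are hypothetical, and the signature table is deliberately incomplete (PNG, JPEG and GIF only).

```python
from typing import Optional

# Leading bytes ("magic numbers") of a few common image formats.
# Illustrative only: many other formats exist.
_SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
)


def sniff_image_type(data: bytes) -> Optional[str]:
    """Return a format name if data starts with a known signature, else None."""
    for signature, name in _SIGNATURES:
        if data.startswith(signature):
            return name
    return None


def explain_payload(data: bytes, content_type: str = "") -> str:
    """Describe a downloaded payload (hypothetical debugging helper)."""
    kind = sniff_image_type(data)
    if kind is not None:
        return f"looks like {kind} ({len(data)} bytes)"
    # Show the declared type and the first bytes so an error page is easy to spot.
    return (
        f"not a recognised image: Content-Type={content_type!r}, "
        f"starts with {data[:20]!r}"
    )
```

With requests this could be called as `explain_payload(response.content, response.headers.get("Content-Type", ""))`. A malformed URL that returns an HTML error page would then likely show up as a `text/html` payload rather than as image bytes.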

ndpu · Answer 1 · 2018-05-04T12:17:41.103

5

I think the data from response.raw is somehow being consumed before it reaches Image.open. The raw object of a requests response is not seekable, so you can read from it only once:

>>> response.raw.seekable()
False

The first open is OK:

>>> response.raw.tell()
0
>>> image = Image.open(response.raw)

The second open raises an error (the stream position is already at the end of the file):

>>> response.raw.tell()
695  # length of this file: https://docs.python.org/3/_static/py.png

>>> image = Image.open(response.raw)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/PIL/Image.py", line 2295, in open
    % (filename if filename else fp))
OSError: cannot identify image file <_io.BytesIO object at 0x7f11850074c0>
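
The read-once behaviour can also be reproduced offline with the standard library alone. The sketch below uses a hypothetical `OneShotStream` class as a stand-in for a forward-only network stream; it is not how requests or urllib3 implement `response.raw`, only an illustration of what "not seekable" means in practice.

```python
import io


class OneShotStream(io.RawIOBase):
    """Hypothetical stand-in for a forward-only stream such as a response body."""

    def __init__(self, payload: bytes) -> None:
        super().__init__()
        self._buffer = io.BytesIO(payload)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False  # like the raw response body, it cannot be rewound

    def readinto(self, b) -> int:
        # Copy the next chunk into b; returns 0 once the payload is exhausted.
        return self._buffer.readinto(b)


stream = OneShotStream(b"fake image bytes")
first = stream.read()   # consumes the whole payload
second = stream.read()  # nothing is left, so this is empty

try:
    stream.seek(0)      # rewinding is not supported
    rewound = True
except io.UnsupportedOperation:
    rewound = False
```

Here `first` holds the full payload, `second` is empty, and the `seek(0)` call fails, which is the situation an image decoder faces when it is given an already consumed stream.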

If you want to use the data several times, store the bytes from the response in a file-like object (or, of course, in a file):

import io
image_data = io.BytesIO(response.raw.read())

Now you can read the image stream and rewind it as many times as needed:

>>> image_data.seekable()
True

image = Image.open(image_data)
image1 = Image.open(image_data)
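
Putting the pieces together for the original goal (an image in a NumPy array, nothing written to disk), one possible sketch is shown below. The function names `decode_image` and `fetch_image_array`, the 10-second timeout and the conversion to RGB are assumptions made for illustration; they are not taken from the question or from this answer.

```python
import io

import numpy as np
import requests
from PIL import Image


def decode_image(data: bytes) -> np.ndarray:
    """Decode encoded image bytes (PNG, JPEG, ...) into a NumPy array."""
    with Image.open(io.BytesIO(data)) as image:
        # Convert to RGB so the result is always height x width x 3.
        return np.asarray(image.convert("RGB"))


def fetch_image_array(url: str, timeout: float = 10.0) -> np.ndarray:
    """Download url and decode it entirely in memory (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail early on 4xx/5xx instead of inside PIL
    return decode_image(response.content)
```

A call such as `fetch_image_array("https://docs.python.org/3/_static/py.png")` would return the test image used above as an array. Because `response.content` already holds the complete body as bytes, `stream=True` and `response.raw` are not needed in this variant, and the bytes can be wrapped in a fresh `io.BytesIO` as often as required.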

edited May 04 '18 at 12:17

answered May 04 '18 at 12:04

This is a great observation! Unfortunately it is not related to the problem I am having. I am working with an API to retrieve image links, so I will do some investigation around this! – m33n May 04 '18 at 12:21
@m33n I think it does not matter where you are getting the file URLs from; I'm testing with the image at `https://docs.python.org/3/_static/py.png` – ndpu May 04 '18 at 12:27
Yes, it does actually work now with an external link, so the problem must be related to my URL – m33n May 04 '18 at 12:37
@m33n anyway, file-like objects from the io module are the way to go if you are working with files in memory – ndpu May 04 '18 at 12:46
I was trying to avoid the writing procedure to optimize my code and process the image directly – m33n May 07 '18 at 07:13
