0

Here’s the code I used:

import requests
from PIL import Image
import io
import cv2
response = requests.get(df1.URL[0]).content

im = Image.open(io.BytesIO(response))

The image is very large. Is there a way to speed things up? EDIT: I don't want to save the image to disk. I just want to read it on the fly.

user

2 Answers

2

To make the question clearer, could you please time your code against the code below on a single image and let us know the difference?

If you need to handle multiple images, you will need some form of concurrency, for example threads via concurrent.futures.

import requests

r = requests.get(url)

with open("out.jpg", 'wb') as f:
    f.write(r.content)
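
To make the timing comparison concrete, here is a minimal sketch (not the answerer's code) that times the download and the decode as two separate steps, so it is visible which one dominates. The names `timed` and `compare_steps` are hypothetical helpers introduced for illustration, and the sketch assumes `requests` and Pillow are installed.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_steps(url):
    """Time the download and the decode separately for one image URL."""
    # Imported here so that timed() stays usable without these packages.
    import io

    import requests
    from PIL import Image

    # Step 1: network transfer only; the body ends up in memory as bytes.
    raw, fetch_seconds = timed(lambda: requests.get(url, timeout=30).content)

    def decode():
        im = Image.open(io.BytesIO(raw))
        im.load()  # Image.open is lazy; load() forces the pixel data to be decoded
        return im

    # Step 2: decoding only; no network and no disk involved.
    im, decode_seconds = timed(decode)
    return {"fetch_s": fetch_seconds, "decode_s": decode_seconds, "size": im.size}
```

Calling `compare_steps(df1.URL[0])` on one image should show whether the time goes into the transfer or into decoding, which is the point the two code paths are being compared on.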

Also, please try setting stream=True:

import requests
from PIL import Image
import io
import cv2
response = requests.get(df1.URL[0], stream=True).content

im = Image.open(io.BytesIO(response))
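
For the multiple-image case mentioned above, a rough sketch of the concurrent.futures idea could look like the following. It keeps every download in memory and never writes to disk. The names `fetch_bytes`, `fetch_many`, and `open_image`, the timeout, and the worker count are all assumptions made for illustration, not part of the original answer.

```python
import io
from concurrent.futures import ThreadPoolExecutor


def fetch_bytes(url, timeout=30):
    """Download one URL and return its body as bytes (nothing touches the disk)."""
    import requests  # imported lazily so another fetcher can be swapped in

    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()
    return resp.content


def fetch_many(urls, fetch=fetch_bytes, max_workers=8):
    """Fetch all URLs in a thread pool; results keep the order of `urls`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def open_image(raw):
    """Wrap downloaded bytes in an in-memory buffer and open them with Pillow."""
    from PIL import Image

    return Image.open(io.BytesIO(raw))
```

A call such as `fetch_many(list(df1.URL[:50]))` would then return a list of byte strings to pass to `open_image`. With thousands of multi-megabyte images, holding every payload in memory at once may not be practical, so working through the URLs in smaller batches is probably safer.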
  • Why will this read the image faster? It appears to use the same `requests.get()` as the OP. – Mark Setchell Jan 02 '20 at 09:54
  • @MarkSetchell the point here is `PIL`; see https://stackoverflow.com/questions/59495998/unable-to-download-image-using-request-in-python-url-send-me-html-in-respons/59496067#59496067 – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 10:15
  • @αԋɱҽԃαмєяιcαη Is there a way to avoid saving the image? I don't have much disk space on Google Colab, so this will be a problem going forward. – user Jan 02 '20 at 10:25
  • @bookfreak what is the size of `pic`? – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 10:32
  • @αԋɱҽԃαмєяιcαη It varies, but from 3 MB and up, and there are 5000 of them. Each URL fetches one image, so overall I'll fetch 5000 images. – user Jan 02 '20 at 10:34
  • @bookfreak check my updated answer and let me know. – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 10:34
  • @αԋɱҽԃαмєяιcαη that's my initial code. It's the one that takes a very long time to load. Your initial answer does the job quickly but has the storage problem. – user Jan 02 '20 at 10:37
  • @bookfreak note that I used `StringIO` – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 10:38
  • @αԋɱҽԃαмєяιcαη Sorry, I didn't see it. It gives `TypeError: initial_value must be str or None, not bytes`. – user Jan 02 '20 at 10:44
  • I have read the link you provided but I still fail to see why or how `response = requests.get(df1.URL[0]).content` can run at a different speed from `r = requests.get(url)` – Mark Setchell Jan 02 '20 at 10:50
  • @MarkSetchell I'm not talking about `requests.get`. I'm just telling you that `PIL` is causing the slowness when opening the image on the fly, which is why I commented `the point here is PIL`. Indeed, `requests.get(df1.URL[0]).content` is equivalent to `r = requests.get(url)`, but I mean the operation that is done after the request. There's a big difference between downloading the image and streaming the image on the fly. – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 10:53
  • So, if the difference is not in `requests.get()`, are you saying that `img = Image.open(StringIO(response.content))` is significantly faster than `im = Image.open(io.BytesIO(response))` ? – Mark Setchell Jan 02 '20 at 10:57
  • @MarkSetchell Certainly not! I'm saying that downloading the image and then opening it is considerably faster than streaming it on the fly! The point of `StringIO` was to check something different, going from `bytes` to `string`. – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 10:58
  • I still don't understand. OP wrote `requests.get()` followed by `Image.open()` and you suggested `requests.get()` followed by `Image.open()` and you say neither of your 2 lines are faster than the OP's two lines, so how can your code be faster? – Mark Setchell Jan 02 '20 at 11:01
  • Well, you are missing a part; again, I repeat: `f.write(r.content)` is different from `im = Image.open(io.BytesIO(response))`, and as for `Image.open()`, that was after I suggested `stream=True`. – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 11:02
  • `f.write()` is about saving an image. OP was asking about *"Reading from a URL"*, and specifically says in the comments that he wants to avoid saving it! – Mark Setchell Jan 02 '20 at 11:05
  • @MarkSetchell I did not look at the title, as I was working from the content. See https://stackoverflow.com/posts/59561093/revisions: `EDIT: I don't want to save image on disk. I just want to read it on the fly.` was added in an edit. – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 11:06
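
On the `TypeError` reported in the comments above: in Python 3, `io.StringIO` is a text buffer and only accepts `str`, while the downloaded `.content` is `bytes`, so `io.BytesIO` is the matching in-memory buffer. A small sketch that uses only the standard library (the `payload` bytes are a made-up stand-in for a downloaded image):

```python
import io

# Stand-in for `requests.get(url).content`; not a real image.
payload = b"\xff\xd8\xff\xe0 placeholder bytes"

try:
    io.StringIO(payload)  # text buffer: rejects bytes
except TypeError as exc:
    # Matches the error quoted in the comments:
    # "initial_value must be str or None, not bytes"
    error_message = str(exc)
else:
    error_message = None

buffer = io.BytesIO(payload)  # binary buffer: accepts bytes
first_bytes = buffer.read(2)  # reads back the first two bytes of the payload
```

This is why the question's `Image.open(io.BytesIO(response))` works on the downloaded bytes while the `StringIO` variant fails before Pillow is even involved.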
0

Thank you @αԋɱҽԃαмєяιcαη for your help. I compared the loading times of the different methods. Here's a link to the results comparison.

user
  • Please note that the `answer` section is not for replying to me. Please edit your question to include the comparison details; you can delete this answer by clicking `delete` under the answer. – αԋɱҽԃ αмєяιcαη Jan 02 '20 at 15:32