I'm using urlretrieve from urllib.request to download images from a website, and my code is extremely slow: it took 12 minutes to save 4 images (64x64 PNG files), about 3 minutes per image. This isn't normal, since the same code runs much faster against other sites. Is the problem coming from the website or from my computer? (My network connection is fast.) Here is the code:

import urllib.request
from PIL import Image
import os.path
import json

#Load and edit latest crypto data for cards
with open("json/latest_crypto.json", 'r') as latest_crypto_json:
    latest_crypto = json.load(latest_crypto_json)
    del latest_crypto["status"]

for i in latest_crypto['data']:
    logo_online_adress = "https://s2.coinmarketcap.com/static/img/coins/64x64/{}.png".format(i)
    logo_local_adress = "misc/cryptoLogo/{}.png".format(i)
    if not os.path.exists(logo_local_adress):
        urllib.request.urlretrieve(logo_online_adress, logo_local_adress)
        current_logo = Image.open(logo_local_adress)
        if current_logo.size != (64, 64):
            resized_logo = current_logo.resize((64,64))
            resized_logo.save(logo_local_adress)
            print(i+" import with resize")
        else:
            print(i+" import without resize")
    else:
        print(i+" already exist")

For context, I'm collecting cryptocurrency logos from CoinMarketCap for later use in HTML code.

I check whether each logo already exists in the destination folder; if it does not, I download it and resize it if needed.

This might be messy, but everything around this line works as intended:

 urllib.request.urlretrieve(logo_online_adress, logo_local_adress)

My only problem is speed. I can't use this script as it is right now because it is far too slow.

Max
  • Maybe the remote site guessed that you're a robot (which you are!) and is deliberately serving those images very slowly? – John Gordon Feb 27 '22 at 19:49

2 Answers

You could try fetching the picture with curl to see if that is faster; if it is not, try your web browser.

If the browser is faster, you may have to pose as a browser client by sending the same headers the browser does.

Some sites go to great lengths to fend off other people's automation.
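
Posing as a browser with urllib could look like the sketch below. The User-Agent string and the helper names build_request and fetch_to_file are illustrative assumptions, not values the site is known to require; copy the real headers from your browser's developer tools.

```python
import shutil
import urllib.request

# Assumed, illustrative header value; replace it with what your browser sends.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/98.0 Safari/537.36"
    ),
}


def build_request(url, headers=BROWSER_HEADERS):
    """Return a Request object that carries browser-like headers."""
    return urllib.request.Request(url, headers=headers)


def fetch_to_file(url, local_path, timeout=30):
    """Download url to local_path, sending the headers defined above."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        with open(local_path, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
```

In the question's loop, fetch_to_file(logo_online_adress, logo_local_adress) would take the place of the urllib.request.urlretrieve call. Whether this helps depends on whether the site really throttles on headers, which is only a guess here.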

user2692263
  • Thanks for your reply. I'm gonna check this out. Hope it will get it to run faster ! – Max Feb 27 '22 at 21:04

You can try this. In my case requests was faster than urllib, so I wrote this to:

  • monitor the download speed
  • write directly to a file, with a custom block size

https://stackoverflow.com/a/75261338/5053475
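
The two points above can be sketched roughly as follows. This is not the code from the linked answer, only a minimal illustration that assumes the third-party requests package is installed; the names save_chunks and download and the 8192-byte default block size are hypothetical choices.

```python
import time


def save_chunks(chunks, local_path):
    """Write an iterable of byte chunks to local_path; return bytes written."""
    total = 0
    with open(local_path, "wb") as out_file:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                out_file.write(chunk)
                total += len(chunk)
    return total


def download(url, local_path, block_size=8192, timeout=30):
    """Stream url to local_path with requests and print the average speed."""
    import requests  # third-party dependency: pip install requests

    start = time.perf_counter()
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        total = save_chunks(response.iter_content(chunk_size=block_size), local_path)
    elapsed = time.perf_counter() - start
    speed = total / elapsed / 1024 if elapsed > 0 else float("inf")
    print(f"{local_path}: {total} bytes in {elapsed:.2f} s ({speed:.1f} KiB/s)")
    return total
```

Timing each file this way should at least show whether the delay is in the transfer itself or elsewhere in the script.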