
I'm trying to download images with Python, but this one picture won't download.
I don't know why: when I run the script it just stops and nothing happens. No image, no error message.

Here's the code. Please tell me the cause and a solution.

import urllib.request

num=404

def down(URL):
    # Build a file name such as "404.jpg" (the dot was missing).
    fullname=str(num)+".jpg"
    urllib.request.urlretrieve(URL,fullname)

im="https://www.thesun.co.uk/wp-content/uploads/2020/09/67d4aff1-ddd0-4036-a111-3c87ddc0387e.jpg"

down(im)
yeeoon

2 Answers


This code should work for you. Try it with the URL you are using and check the result:

import requests

pic_url = "https://www.thesun.co.uk/wp-content/uploads/2020/09/67d4aff1-ddd0-4036-a111-3c87ddc0387e.jpg"
cookies = dict(BCPermissionLevel='PERSONAL')

# Send a browser-like User-Agent along with the request.
response = requests.get(pic_url, headers={"User-Agent": "Mozilla/5.0"}, cookies=cookies, stream=True)
if not response.ok:
    print(response)
else:
    # Only write the file when the request succeeded.
    with open('aa.jpg', 'wb') as handle:
        for block in response.iter_content(1024):
            handle.write(block)


Moetaz Brayek

What @MoetazBrayek says in their comment (but not answer) is correct: the website you're querying is blocking the request.

It's common for sites to block requests based on user-agent or referer: if you try to curl https://www.thesun.co.uk/wp-content/uploads/2020/09/67d4aff1-ddd0-4036-a111-3c87ddc0387e.jpg you will get an HTTP error (403 Access Denied):

❯ curl -I https://www.thesun.co.uk/wp-content/uploads/2020/09/67d4aff1-ddd0-4036-a111-3c87ddc0387e.jpg
HTTP/2 403 

Apparently The Sun wants a browser's user-agent, and specifically the string "mozilla" is enough to get through:

❯ curl -I -A mozilla https://www.thesun.co.uk/wp-content/uploads/2020/09/67d4aff1-ddd0-4036-a111-3c87ddc0387e.jpg
HTTP/2 200 
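
The same behaviour can be reproduced from Python without touching the real site. The sketch below is only an illustration, not a description of The Sun's actual filter: `PickyHandler` is a hypothetical local stand-in server that answers 403 unless the User-Agent contains "mozilla" (the rule inferred from the curl output above), and `head_status` is an illustrative helper that mimics `curl -I`. Note that urllib reports a 4xx response by raising `urllib.error.HTTPError` instead of returning it.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PickyHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a site that filters on User-Agent."""

    def do_HEAD(self):
        agent = self.headers.get("User-Agent", "")
        # Assumed rule: anything containing "mozilla" is let through.
        self.send_response(200 if "mozilla" in agent.lower() else 403)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def head_status(url, user_agent=None):
    """Return the HTTP status of a HEAD request, similar to `curl -I`."""
    headers = {"User-Agent": user_agent} if user_agent else {}
    req = urllib.request.Request(url, headers=headers, method="HEAD")
    try:
        with urllib.request.urlopen(req) as res:
            return res.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 4xx/5xx responses


# Start the stand-in server on a free local port.
server = HTTPServer(("127.0.0.1", 0), PickyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
local_url = f"http://127.0.0.1:{server.server_port}/picture.jpg"

default_status = head_status(local_url)             # urllib's default agent
browser_status = head_status(local_url, "mozilla")  # browser-like agent

server.shutdown()
server.server_close()
print(default_status, browser_status)
```

With no explicit header, urllib identifies itself as `Python-urllib/3.x`, which the stand-in rejects just as the real site rejects curl's default agent.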

You will have to either switch to the requests package or replace your url string with a proper urllib.request.Request object so you can customise more pieces of the request. And apparently urlretrieve does not support Request objects so you will also have to use urlopen:

import shutil
import urllib.request

# URL and filename are the values from the question.
req = urllib.request.Request(URL, headers={'User-Agent': 'mozilla'})
res = urllib.request.urlopen(req)
assert res.status == 200
with open(filename, 'wb') as out:
    shutil.copyfileobj(res, out)
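
Applied to the `down` function from the question, this could look roughly like the sketch below. The helper name `build_request` and the extra `filename` parameter are assumptions made for illustration; only `Request`, `urlopen` and `shutil.copyfileobj` come from the snippet above.

```python
import shutil
import urllib.request


def build_request(url, user_agent="mozilla"):
    """Wrap the URL in a Request carrying a browser-like User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def down(url, filename):
    """Stream the response body to `filename`.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a
    blocked request fails loudly instead of writing an empty file.
    """
    with urllib.request.urlopen(build_request(url)) as res:
        with open(filename, "wb") as out:
            shutil.copyfileobj(res, out)
    return filename
```

Calling `down(im, "404.jpg")` with the URL from the question would then save the picture, assuming the site still accepts that user-agent.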
Masklinn
  • No! I am able to visit the webpage. I don't think it is web issue. See my answer. :) –  Jan 22 '21 at 08:57
  • @Istiak I fear you're just plain wrong. The URL is publicly acceptable, you can try the snippet you suggest and you will see that it doesn't work. However `urlopen(Request(url, headers={'User-Agent': 'mozilla'}))` will immediately return with a status of 200, and the data. – Masklinn Jan 22 '21 at 09:07
  • publicly *available*, not acceptable (though it's that as well). You are able to visit the page because your browser sends a browser's user-agent, and The Sun filters requests on that basis. – Masklinn Jan 22 '21 at 09:13