There is an image on a webpage that I would like to save to disk using Python. This is what I tried:

    r = requests.get(url, timeout=60)
    p = os.path.sep.join([args["output"], "{}.jpeg".format(str(total).zfill(5))])
    f = open(p, "wb")
    f.write(r.content)
    f.close()

But the saved file is not actually an image, as `file` shows:

    $ file 00018.jpeg
    00018.jpeg: HTML document, ASCII text, with very long lines, with no line terminators

Then I tried:

    r=requests.get(url, timeout=60)
    p=os.path.sep.join([args["output"],"{}.jpeg".format(str(total).zfill(5))])
    f=open(p, "wb")
    i=r.raw
    q=Image.open(BytesIO(r.content))
    print(q.type)
    f.write(i)
    f.close()

But with no success. What should I do?

UPDATE:

    r = requests.get(url, timeout=60)
    # save the image to disk
    p = os.path.sep.join([args["output"], "{}.jpeg".format(
        str(total).zfill(5))])

    with open("test.jpeg", "wb+") as f:
        f.write(requests.get("name_of_website", headers=headers).content)

When I saved the image manually from the browser, it was in JPG format.

Hrushi

2 Answers

This page requires a cookie before it will serve the image.

Without it, you cannot fetch the image directly.

An easy fix is to add the cookie to your request headers:

import requests

headers = {
    "Cookie":"visid_incap_276192=vO9ugmNqRS+XGehZnF1jiwL8kl4AAAAAQUIPAAAAAADc6Z+46+Lp6X9DL0FUaSOv; incap_ses_627_276192=HgPZUq1t1yD2FURXnY2zCAL8kl4AAAAAyQ+1ZeYdSVzPTcurvHnlwA==; JSESSIONID=0001Zh35TV6HDxcVflnHMwIHsqe:-1801K8D; incap_ses_553_276192=XuxOZn9AsVOTcVuFwKasB3P9kl4AAAAAaxsIzIzT5BwV8RqhcTVPsw==",
}

with open("test.jpg","wb+") as f:
    f.write(requests.get("https://www.e-zpassny.com/vector/jcaptcha.do",headers=headers).content)

Now the image downloads successfully.
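A cookie string pasted from the browser will expire eventually. As a rough sketch of an alternative, and assuming the site sets its cookies on an ordinary page visit (this is not confirmed here, and bot protection may require JavaScript), a session object can collect the cookies itself. `fetch_with_session` and `page_url` are hypothetical names, and `session` is anything that behaves like `requests.Session`.

```python
def fetch_with_session(session, page_url, image_url, timeout=60):
    """Visit page_url so the session stores cookies, then fetch image_url.

    Hypothetical helper: `session` should behave like requests.Session,
    and `page_url` is an assumed landing page that sets the cookies.
    """
    # First request: any Set-Cookie headers in the reply are kept
    # by the session and sent along with later requests.
    session.get(page_url, timeout=timeout)
    # Second request reuses those cookies automatically.
    response = session.get(image_url, timeout=timeout)
    # Fail loudly on 4xx/5xx instead of silently saving an error page.
    response.raise_for_status()
    return response.content
```

With requests installed, this could be called as `fetch_with_session(requests.Session(), page_url, "https://www.e-zpassny.com/vector/jcaptcha.do")`, where `page_url` is whichever page of the site displays the image.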

jizhihaoSAMA
  • The files get downloaded, but when I open them they're empty. `file 00008.jpg` gives `00008.jpg: empty` – Hrushi Apr 12 '20 at 11:59
  • I have updated my question to show how I ran the script; I also added the cookies – Hrushi Apr 12 '20 at 12:00
  • Why not use `print(response.content)` to check the bytes of the response? If it prints `b'xxxxx'`, you have fetched the image successfully. – jizhihaoSAMA Apr 12 '20 at 12:02
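Inspecting the response bytes, as suggested in the comment above, can be made a little more systematic by checking the leading "magic" bytes, which is roughly what the `file` command does. The following is a minimal sketch using only the standard library; `sniff_image_type` and `describe_payload` are hypothetical helper names, and only three common formats are covered.

```python
def sniff_image_type(data: bytes):
    """Return 'jpeg', 'png', 'gif', or None based on the leading bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    return None


def describe_payload(data: bytes) -> str:
    """Summarize what a downloaded payload appears to be."""
    kind = sniff_image_type(data)
    if kind:
        return f"{kind} image, {len(data)} bytes"
    if not data:
        return "empty"
    # Show the first bytes so an HTML error page is easy to spot.
    return "not an image, starts with: " + repr(data[:40])
```

Calling `describe_payload(r.content)` before writing the file would show whether the server returned image data, nothing at all, or an HTML page.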
I think you should do something like this:

r = requests.get(url, timeout=60)
q = Image.open(BytesIO(r.content))
# os.path.join takes separate arguments, not a list
fp = os.path.join(args["output"], f"{str(total).zfill(5)}.jpeg")  # f-string used because it is more compact
q.save(fp)  # save() returns None, so the result is not assigned

Image.save() is described in the Pillow documentation.
F-strings are a way of formatting strings; they are described in the Python documentation.

I hope that it's helpful, have a good day!

EDIT: OK, it looks like that doesn't work, so you can try this instead:

r = requests.get(url, timeout=60)

buf = BytesIO(r.content)  # renamed from `bytes` to avoid shadowing the built-in
buf.seek(0)
q = Image.open(buf)

# os.path.join takes separate arguments, not a list
fp = os.path.join(args["output"], f"{str(total).zfill(5)}.jpeg")  # f-string used because it is more compact
q.save(fp)
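When the goal is only to save the downloaded bytes unchanged, Pillow is not strictly needed. Here is a minimal sketch of the path building and file writing on their own, using only the standard library; `build_output_path` and `save_bytes` are hypothetical names, and the five-digit zero padding mirrors `str(total).zfill(5)` from the question.

```python
import os


def build_output_path(output_dir: str, index: int, ext: str = "jpeg") -> str:
    """Build e.g. <output_dir>/00018.jpeg from a directory and a counter."""
    # os.path.join takes the components as separate arguments.
    return os.path.join(output_dir, f"{index:05d}.{ext}")


def save_bytes(data: bytes, path: str) -> int:
    """Write data to path in binary mode and return the number of bytes written."""
    # The with-block closes the file, so no explicit close() is needed.
    with open(path, "wb") as f:
        return f.write(data)
```

In the question's terms this would be `save_bytes(r.content, build_output_path(args["output"], total))`. It writes whatever the server returned, so it is worth confirming first that the response really is image data.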

diduk001
  • I got this error `"cannot identify image file %r" % (filename if filename else fp) PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7febf9fcfa98>` at the line q=Image.open(Byt...). – Hrushi Apr 12 '20 at 11:43
  • For some strange reason, I get the same error. Is there any other way to extract the image without using requests? – Hrushi Apr 12 '20 at 15:34
  • @Hrushi I don't know, but I am sure you can display the image using `Image.open(BytesIO(response.content)).show()`, because I did that some time ago. I don't know how to do this without requests, but I think that if you can display it, you can also save it to a folder. Maybe [this](https://stackoverflow.com/q/31077366/13283436) can help you. – diduk001 Apr 12 '20 at 15:41