116

I have a problem saving an image from a URL with Python, using either a urllib2 request or urllib.urlretrieve. The image URL is valid, and I can download the image manually in a browser. However, when I download it with Python, the resulting file cannot be opened. I use Preview on Mac OS to view the image. Thank you!

UPDATE:

The code is as follows:

def downloadImage(self):
    request = urllib2.Request(self.url)
    pic = urllib2.urlopen(request)
    print "downloading: " + self.url
    print self.fileName
    filePath = localSaveRoot + self.catalog  + self.fileName + Picture.postfix
    # urllib.urlretrieve(self.url, filePath)
    with open(filePath, 'wb') as localFile:
        localFile.write(pic.read())

The image URL that I want to download is http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg

This URL is valid and I can save the image through the browser, but the Python code downloads a file that cannot be opened. Preview says "It may be damaged or use a file format that Preview doesn't recognize." I compared the image downloaded by Python with the one downloaded manually through the browser. The former is several bytes smaller, so the file seems to be incomplete, but I don't know why Python cannot download it completely.

Shaoxiang Su
  • Why can't it be opened? What error do you get? What does ``file `` tell you? Did the file download correctly or were you blocked by ``User-Agent`` or ``Cookie`` restrictions or similar? – James Mills May 14 '15 at 04:31
  • 1
    Include the python code you are trying in the question please – Tom McClure May 14 '15 at 04:32
  • Sorry for the confusion. I have provided more details. Thanks a lot. I wonder whether the HTTP request made by Python differs from the one a browser makes, so that Python cannot give me a complete image file. – Shaoxiang Su May 14 '15 at 06:50
  • It seems that requests is a much better module than urllib and urllib2 – Shaoxiang Su May 14 '15 at 08:15
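
Following up on the `file` suggestion in the comments: a minimal sketch, assuming the downloaded bytes are already on disk, that checks a file's leading bytes against a few well-known image signatures. The names `sniff_image_type` and `sniff_file` and the small signature table are illustrative, not part of any library. A result of `None` suggests the server may have returned something other than an image (for example an HTML error page).

```python
# Leading byte signatures of a few common image formats.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]


def sniff_image_type(data):
    """Return a format name if data starts with a known signature, else None."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    # WebP is a RIFF container: "RIFF" + 4 size bytes + "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def sniff_file(path):
    """Read the first bytes of the file at path and classify them."""
    with open(path, "rb") as f:
        return sniff_image_type(f.read(16))
```

Calling `sniff_file` on the file saved by Python and on the one saved by the browser shows whether both really start as the same format.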

10 Answers

203
import requests

img_data = requests.get(image_url).content
with open('image_name.jpg', 'wb') as handler:
    handler.write(img_data)
Vlad Bezden
  • 4
    @vlad what if we are not aware of the image extension from the URL but we know it is an image? – Mona Jalal Apr 02 '18 at 03:02
  • 2
    @MonaJalal you don't have to specify an extension, as long as you have valid qualified URL address. – Vlad Bezden Apr 02 '18 at 11:14
  • 4
    `pip install requests` if you don't have it – devugur Jan 15 '21 at 10:53
  • 1
    Using '.content' after requests.get() is the key to save an image – Felipe Toledo Jun 24 '21 at 23:24
  • 2
    It does not work for the following URL; any idea how to fix it? https://www.genome.jp/pathway/ko02024+K07173 – Cleb Oct 17 '21 at 20:03
  • @VladBezden - This saves the image to a folder, but when I open the image it says that windows does not support the file format, despite it being a simple jpg. Do you know why? – Parseval Apr 13 '22 at 09:37
  • When downloading a webp, the files are corrupted somehow. Using ffprobe, I am told `missing RIFF tag` and `Could not find codec parameters for stream 0 (Video: webp, none):` – scrollout Aug 06 '22 at 04:04
  • Note that this downloads the whole image to memory first and then writes it to a file. If you want to stream the data directly to a file use e.g. [this](https://stackoverflow.com/a/59549993/3782904) answer – thi gg Dec 04 '22 at 12:17
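
Regarding the reports above of files that will not open, and the note about streaming: a hedged sketch, using only the standard library, that streams the response to disk in chunks and returns the byte count alongside the `Content-Length` the server declared (if any), so that a truncated download can be noticed. `download_streaming` and its parameters are names chosen for this illustration.

```python
import urllib.request


def download_streaming(url, path, chunk_size=64 * 1024):
    """Stream url to path; return (bytes_written, declared_length_or_None)."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        # Content-Length may be absent, e.g. for chunked responses.
        declared = response.headers.get("Content-Length")
        written = 0
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written, (int(declared) if declared is not None else None)
```

If the two numbers differ, the transfer was cut short. If they match and the file still will not open, the server probably sent something other than the expected image.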
91

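Sample code that works for me on Windows:

import requests

# pic_url is assumed to hold the URL of the image to download
with open('pic1.jpg', 'wb') as handle:
    # stream=True defers downloading the body until it is iterated
    response = requests.get(pic_url, stream=True)

    if not response.ok:
        print(response)

    # Write the body to disk in 1024-byte blocks
    for block in response.iter_content(1024):
        if not block:
            break

        handle.write(block)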
DeepSpace
34

This is the simplest way to download and save an image from the internet, using the urllib.request package.

Pass the image URL (where the image is downloaded from) and the local path (where the image is saved, including a file name ending in .jpg or .png). Here I used "local-filename.jpg"; replace it with your own file name.

Python 3

import urllib.request
imgURL = "http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg"

urllib.request.urlretrieve(imgURL, "D:/abc/image/local-filename.jpg")

You can download multiple images as well if you have all the image URLs. Just pass those URLs in a for loop, and the code downloads each image in turn.
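
The loop described above might look like the following sketch. The helpers `local_name_for` and `download_all`, and the target directory argument, are assumptions made for illustration; the file name is taken from the last path segment of each URL, with a numbered fallback when the URL has none.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_name_for(url, index):
    """Use the last path segment of url as the file name, or a numbered fallback."""
    name = os.path.basename(urlparse(url).path)
    # The ".jpg" fallback extension is an arbitrary assumption.
    return name or f"image-{index}.jpg"


def download_all(urls, target_dir):
    """Download every URL in urls into target_dir and return the saved paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        path = os.path.join(target_dir, local_name_for(url, index))
        urllib.request.urlretrieve(url, path)
        saved.append(path)
    return saved
```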

Ankit Lad
  • 2
    I tried this but I get an error: HTTPError: Forbidden. Do you know why this is? I'm using this URL: http://assets.ellosgroup.com/i/ellos/ell_1682670-01_Fs. – Parseval Apr 13 '22 at 09:46
  • 1
    @Parseval, adding this code fixed it for me (a User-Agent was needed): `import urllib.request as urlopen; opener = urlopen.build_opener(); opener.addheaders = [('User-Agent', 'Chrome')]; urlopen.install_opener(opener)` – Daniel Danielecki Apr 20 '23 at 18:04
18

Python code snippet to download a file from a URL and save it under its own name:

import requests

url = 'http://google.com/favicon.ico'
filename = url.split('/')[-1]
r = requests.get(url, allow_redirects=True)
with open(filename, 'wb') as f:
    f.write(r.content)
Basil Jose
7
import random
import urllib.request

def download_image(url):
    name = random.randrange(1,100)
    fullname = str(name)+".jpg"
    urllib.request.urlretrieve(url,fullname)     
download_image("http://site.meishij.net/r/58/25/3568808/a3568808_142682562777944.jpg")
mdaniel
learner
  • 2
    Welcome to Stackoverflow and thanks for your contribution! Could you add an explanation to your answer what the code does and why it works? Thanks! – Max Vollmer Sep 09 '18 at 14:40
  • How do I add headers for the URL in urlretrieve? I had a problem with images that open in the browser but not through code using urlretrieve. I have tried urlopen, but I don't know how to download the image using urlopen. – Eswar Mar 27 '19 at 14:38
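
In reply to the question about headers: `urlretrieve` itself takes no headers argument, but a `urllib.request.Request` does. A minimal sketch, assuming the server only needs a browser-like `User-Agent`; the header value and the function names `build_request` and `download_with_headers` are placeholders for illustration.

```python
import shutil
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Create a Request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download_with_headers(url, path, user_agent="Mozilla/5.0"):
    """Fetch url with the custom header and copy the body to path."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
```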
3

You can pick any image from Google Images, copy its URL, and use the following approach to download it. Note that the extension isn't always included in the URL, as some of the other answers seem to assume. You can detect the correct extension automatically with imghdr, a standard-library module (deprecated since Python 3.11 and removed in Python 3.13).

import requests, imghdr

gif_url = 'https://media.tenor.com/images/eff22afc2220e9df92a7aa2f53948f9f/tenor.gif'
img_url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQwXRq7zbWry0MyqWq1Rbq12g_oL-uOoxo4Yw&usqp=CAU'
for url, save_basename in [
    (gif_url, 'gif_download_test'),
    (img_url, 'img_download_test')
]:
    response = requests.get(url)
    if response.status_code != 200:
        raise RuntimeError(f"Download failed ({response.status_code}): {url}")
    extension = imghdr.what(file=None, h=response.content)
    save_path = f"{save_basename}.{extension}"
    with open(save_path, 'wb') as f:
        f.write(response.content)
  • This seems like the most upvoted answer, except with extra steps, doesn't it? – Dominik Stańczak Jul 06 '22 at 11:49
  • 2
    The extra step is there to determine the correct file extension. The most upvoted answer doesn't do this. People were asking why the most upvoted answer doesn't work for all image urls. It's because you can't always assume that the image is jpg. You can't save a jpg image as png, and you can't save a png image as jpg. This is a problem when you don't know the correct extension beforehand. As an example, try downloading this image and see what happens: https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTJT899RTIAkN3W4lte7HjXzt9sbeviVdNaiA&usqp=CAU – Clayton Mork Jul 12 '22 at 05:24
  • Fair point, that makes sense. Thanks! – Dominik Stańczak Jul 12 '22 at 05:38
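
As an alternative to inspecting the bytes, the extension can often be derived from the response's `Content-Type` header with the standard `mimetypes` module. This is a sketch under the assumption that the server reports the type correctly; `extension_for` is a name invented here, and the `.bin` fallback is arbitrary.

```python
import mimetypes


def extension_for(content_type, default=".bin"):
    """Map a Content-Type header value such as 'image/png' to a file extension."""
    if not content_type:
        return default
    # Drop parameters such as '; charset=utf-8' before the lookup.
    mime = content_type.split(";")[0].strip().lower()
    return mimetypes.guess_extension(mime) or default
```

With requests, the header value is available as `response.headers.get('Content-Type')`.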
2

On Linux, you can use the wget command:

import os
url1 = 'YOUR_URL_WHATEVER'
os.system('wget {}'.format(url1))
Vicrobot
  • That gives me an empty image for the following URL: https://www.genome.jp/pathway/ko02024+K07173 Any idea how to fix this? – Cleb Oct 17 '21 at 19:51
  • 3
    @Cleb That's because the url you provided doesn't belong to an image. Try it with ```url1 = 'https://www.genome.jp/tmp/mark_pathway1641220140108369/ko02024.png'``` in this case – RAZ0229 Jan 03 '22 at 14:32
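
A note on the `os.system` call in the answer above: URLs containing `&` or other shell metacharacters (such as the gstatic URLs elsewhere on this page) are interpreted by the shell before wget sees them. A hedged sketch that passes the URL as a single argument via `subprocess` instead, assuming `wget` is installed; `wget_command` and `download_with_wget` are illustrative names.

```python
import subprocess


def wget_command(url, output_path):
    """Build the argument list; -O tells wget where to write the file."""
    return ["wget", "-O", output_path, url]


def download_with_wget(url, output_path):
    """Run wget without a shell, raising CalledProcessError on failure."""
    subprocess.run(wget_command(url, output_path), check=True)
```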
1

If you are wondering how to get the image extension, you can try the string split method on the image URL:

import requests

# img_url and img_name are assumed to be defined by the caller
str_arr = str(img_url).split('.')
img_ext = '.' + str_arr[-1]  # the extension follows the last dot, e.g. www.bigbasket.com/patanjali-atta.jpg gives .jpg
img_data = requests.get(img_url).content
with open(img_name + img_ext, 'wb') as handler:
    handler.write(img_data)
Ssubrat Rrudra
1

Download and save an image to a directory:

import requests

headers = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0",
           "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
           "Accept-Language": "en-US,en;q=0.9"
           }

img_data = requests.get(url=image_url, headers=headers).content
with open(create_dir() + "/" + 'image_name' + '.png', 'wb') as handler:
    handler.write(img_data)

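To create the directory:

import os

def create_dir():
    # Name of the directory to create
    dir_ = "CountryFlags"
    # Parent directory: the folder containing this script
    parent_dir = os.path.dirname(os.path.realpath(__file__))
    path = os.path.join(parent_dir, dir_)
    # exist_ok=True avoids an error when the directory already exists
    os.makedirs(path, exist_ok=True)
    return path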
zaheer
0

If you want to stick to 2 lines:

with open(os.path.join(dir_path, url[0]), 'wb') as f:
    f.write(requests.get(new_url).content)
Hamza Rashid
Spinstaz