How can I get the file size from a link without downloading it in Python?

Question

I have a list of links, and I want to know the size of each file so I can estimate how much computational resources it will need. Is it possible to get just the file size with a GET request or something similar?

Here is an example of one of the links: https://sra-download.ncbi.nlm.nih.gov/traces/sra46/SRR/005150/SRR5273887

Thanks

You can take a look [here](https://stackoverflow.com/questions/5909/get-size-of-a-file-before-downloading-in-python). — Vasilis G., Mar 18 '19 at 16:59

score 9 · Accepted Answer · answered Mar 18 '19 at 17:11

Use the HTTP HEAD method, which retrieves only the response headers for the URL and does not download the content the way an HTTP GET request does.

$ curl -I https://sra-download.ncbi.nlm.nih.gov/traces/sra46/SRR/005150/SRR5273887
HTTP/1.1 200 OK
Server: nginx
Date: Mon, 18 Mar 2019 16:56:35 GMT
Content-Type: application/octet-stream
Content-Length: 578220087
Last-Modified: Tue, 21 Feb 2017 12:13:19 GMT
Connection: keep-alive
Accept-Ranges: bytes

The file size is in the 'Content-Length' header. In Python 3.6:

>>> import urllib.request
>>> req = urllib.request.Request('https://sra-download.ncbi.nlm.nih.gov/traces/sra46/SRR/005150/SRR5273887',
...                              method='HEAD')
>>> f = urllib.request.urlopen(req)
>>> f.status
200
>>> f.headers['Content-Length']
'578220087'
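
Since the question involves a list of links, the REPL session above could be wrapped into small functions along the following lines. This is only a sketch: the names `parse_content_length`, `head_size` and `collect_sizes` are made up for illustration, the 10-second timeout is an arbitrary choice, and it assumes that a URL which fails with a network error should simply be reported as `None`. A server is also free to omit `Content-Length`, which is reported as `None` as well.

```python
import urllib.request


def parse_content_length(headers):
    """Return Content-Length as an int, or None if it is missing or not a number."""
    value = headers.get("Content-Length")
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def head_size(url, timeout=10):
    """Send a HEAD request and return the advertised size in bytes (or None)."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers)


def collect_sizes(urls, fetch=head_size):
    """Map each URL to its size; URLs that fail with a network error map to None."""
    sizes = {}
    for url in urls:
        try:
            sizes[url] = fetch(url)
        except OSError:  # urllib.error.URLError and timeouts are OSError subclasses
            sizes[url] = None
    return sizes
```

Calling `collect_sizes(list_of_links)` returns a dictionary keyed by URL. The `fetch` parameter is there only so the network call can be swapped out, for example in tests.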

Note: if the remote server does not implement HEAD, you can still achieve something similar by using the stream=True option of the Python requests library, as in https://stackoverflow.com/a/44299915, and then closing each request directly after you have obtained its headers. — Maarten Derickx, Dec 21 '20 at 16:42
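
A rough sketch of that fallback, assuming the third-party requests package is installed. The function name `size_via_streaming_get` and its `session` parameter are hypothetical; `session` is there so that a single `requests.Session` can be reused across many links, and when it is omitted the module-level `requests.get` is used instead.

```python
def size_via_streaming_get(url, session=None, timeout=10):
    """Return Content-Length from a streamed GET, or None if the header is absent."""
    if session is None:
        import requests  # third-party: pip install requests
        session = requests  # the module exposes the same get() as a Session

    # stream=True makes requests return once the headers have arrived;
    # the body is only fetched if the content is accessed
    response = session.get(url, stream=True, timeout=timeout)
    try:
        response.raise_for_status()
        value = response.headers.get("Content-Length")
        return int(value) if value is not None else None
    finally:
        # close without touching .content, so requests does not read the body
        response.close()
```

The server may still start sending the body before the connection is closed, so this is likely to cost a little more bandwidth than a HEAD request, but far less than downloading the whole file.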

ccpizza · Answer 2 · 2021-09-29T18:55:56.533

You need to use the HEAD method. The example uses requests (pip install requests).

#!/usr/bin/env python
# display URL file size without downloading

import sys
import requests

# pass URL as first argument
response = requests.head(sys.argv[1], allow_redirects=True)

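# Content-Length is optional, so the server may not send it
size = response.headers.get('content-length')

if size is None:
    print('{:<40}: unknown (no Content-Length header)'.format('FILE SIZE'))
else:
    # size in megabytes, counting 1 MB as 2**20 bytes (Python 2, 3)
    print('{:<40}: {:.2f} MB'.format('FILE SIZE', int(size) / float(1 << 20)))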

# size in megabytes (f-string, Python 3 only)
# print(f"{'FILE SIZE':<40}: {int(size) / float(1 << 20):.2f} MB")

Also see "How do you send a HEAD HTTP request in Python 2?" if you need a solution based on the standard library.

score 1 · Answer 3 · answered Mar 18 '19 at 17:03

If you're using Python 3, you can do it using urlopen from urllib.request:

from urllib.request import urlopen
link = "https://sra-download.ncbi.nlm.nih.gov/traces/sra46/SRR/005150/SRR5273887"
site = urlopen(link)
meta = site.info()
print(meta)

This will output:

Server: nginx
Date: Mon, 18 Mar 2019 17:02:40 GMT
Content-Type: application/octet-stream
Content-Length: 578220087
Last-Modified: Tue, 21 Feb 2017 12:13:19 GMT
Connection: close
Accept-Ranges: bytes

The Content-Length header is the size of your file in bytes.

`urlopen` will do a `GET` request and will in fact download the document. — ccpizza, Mar 18 '19 at 17:26
