90

I'm trying to download some public data files. I screenscrape to get the links to the files, which all look something like this:

ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/nhanes/2001-2002/L28POC_B.xpt

I can't find any documentation on the Requests library website.

Martin Prikryl
user1507455

9 Answers

121

The requests library doesn't support ftp:// links.

To download a file from an FTP server you could use urlretrieve:

import urllib.request

urllib.request.urlretrieve('ftp://server/path/to/file', 'file')
# if you need to pass credentials:
#   urllib.request.urlretrieve('ftp://username:password@server/path/to/file', 'file')
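If the user name or password contains characters such as `@`, `:` or `/`, they have to be percent-encoded before they go into the URL. A minimal sketch of that, where `build_ftp_url` is a hypothetical helper name and not part of `urllib`; note that the credentials still travel over the network unencrypted:

```python
from urllib.parse import quote


def build_ftp_url(user, password, host, path):
    """Build an ftp:// URL with percent-encoded credentials (hypothetical helper)."""
    # safe="" also encodes "/" so that it cannot be mistaken for a path separator
    user_part = quote(user, safe="")
    password_part = quote(password, safe="")
    # In the path, "/" keeps its meaning, so the default safe="/" is used
    path_part = quote(path.lstrip("/"))
    return "ftp://{}:{}@{}/{}".format(user_part, password_part, host, path_part)
```

The result can be passed to `urllib.request.urlretrieve` or `urllib.request.urlopen` in place of a hand-written URL.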

Or urlopen:

import shutil
import urllib.request
from contextlib import closing

with closing(urllib.request.urlopen('ftp://server/path/to/file')) as r:
    with open('file', 'wb') as f:
        shutil.copyfileobj(r, f)
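On current Python 3 versions the response returned by `urlopen` can be used directly in a `with` statement, so `closing` is optional there. A sketch of the same download wrapped in a function; the name `download` and the `timeout` default are assumptions chosen for illustration:

```python
import shutil
import urllib.request


def download(url, dest, timeout=60):
    """Copy the resource at url to the local path dest and return dest."""
    # The response object closes itself when the with block exits
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(dest, "wb") as out:
            # Streams in blocks, so large files are not held in memory at once
            shutil.copyfileobj(response, out)
    return dest
```

For the file in the question this would be called as `download('ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/nhanes/2001-2002/L28POC_B.xpt', 'L28POC_B.xpt')`.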

Python 2:

import shutil
import urllib2
from contextlib import closing

with closing(urllib2.urlopen('ftp://server/path/to/file')) as r:
    with open('file', 'wb') as f:
        shutil.copyfileobj(r, f)
Boris Verkhovskiy
jfs
  • 2
    Thank you for this, but how can you provide credentials? – SSH This Feb 14 '14 at 17:25
  • 4
    @SSHThis: try: `'ftp://username:password@server/path/to/file'` or use [@Rakesh's answer](http://stackoverflow.com/a/12424311/4279). If you can't make it to work, [ask](http://stackoverflow.com/questions/ask). – jfs Feb 14 '14 at 17:46
  • 3
    A bit of urllib vs. requests information here: http://www.blog.pythonlibrary.org/2012/06/07/python-101-how-to-download-a-file/ – cbare Aug 06 '14 at 23:09
  • 2
    @cbare: what is the point of the link. Does `requests` support `ftp` at all? – jfs Nov 03 '15 at 18:08
  • @J.F. Sebastian: the article shows a side-by-side comparison of `requests` and `urllib`. Maybe that's helpful to someone switching from `requests` to `urllib` to gain FTP support? Or do you think it's better to delete the comment? – cbare Nov 04 '15 at 22:26
  • @cbare: It's up to you – jfs Nov 04 '15 at 23:45
  • @J.F.Sebastian Any chance we could get an update? This doesn't work in Py3 and ftplib isn't cutting it – zelusp Oct 08 '16 at 14:27
  • @zelusp if replacing `import urllib2` with `import urllib.request` "doesn't work" for you then ask a separate question (provide specific errors that you get). It might be worth it to ask a new question anyway with your specific requirements (describe in the question how exactly "ftplib isn't cutting it": what do you expect to happen? What happens instead? (in detail)). – jfs Oct 08 '16 at 14:55
  • @J.F.Sebastian, You're completely right - it was really late and I just wanted a magical answer. But maybe I'll solve my problem now that I've slept - thanks for reply! – zelusp Oct 09 '16 at 20:53
  • If downloading multiple FTP files, "IOError: [Errno ftp error] 200 Switching to Binary mode." would be thrown when using "urllib.urlretrieve". – alextc Jan 04 '18 at 02:32
  • I realise this is a really old post, sorry for bumping it. But in your answer to @SSHThis `'ftp://username:password@server/path/to/file'`, is the username and password encrypted or "hidden", or can anyone who sees your traffic read it just like they could read a url? Sorry I don't know too much about encryption and how all this works but I was curious. – Marses Jan 12 '18 at 11:18
  • 1
    @LimokPalantaemon it is equivalent to `ftp.login(user, passw)` call and therefore it is not encrypted (ftp is a very old protocol—little security). You could try sftp instead (fabric/paramiko). – jfs Jan 12 '18 at 11:59
  • Wow, I'm amazed that it takes three modules and six lines to download a file... Thanks so much for your Python3 answer - I was getting desperate after trying `urllib`, `requests` and `wget` (the latter can't overwrite existing files...) – AstroFloyd Oct 22 '19 at 10:22
  • Why do you use `contextlib.closing` when `urllib.response.addinfourl.__exit__` already calls `self.close()`? – gerrit Jan 15 '20 at 13:58
  • @gerrit: my guess is that the response object didn't support the context manager protocol in 2012. – jfs Jan 15 '20 at 17:40
  • why is `shutil` needed with `urllib.request`? – william_grisaitis Jun 15 '21 at 22:04
  • 1
    @grisaitis it is just a loop: `while data := r.read(blocksize): f.write(data)` (copy data from the input file object `r` to the output file `f`) – jfs Jun 15 '21 at 23:34
  • How would I go about downloading the files to save them locally? – Jsleshem Sep 17 '21 at 16:16
  • 1
    @Jsleshem: you are commenting on the answer that does exactly that. – jfs Sep 18 '21 at 13:15
71

You can try this:

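import ftplib

path = 'pub/Health_Statistics/NCHS/nhanes/2001-2002/'
filename = 'L28POC_B.xpt'

ftp = ftplib.FTP("Server IP")  # host name or IP address of the FTP server
ftp.login("UserName", "Password")
ftp.cwd(path)
# Open the local file in a with block so it is flushed and closed after the transfer
with open(filename, 'wb') as f:
    ftp.retrbinary("RETR " + filename, f.write)
ftp.quit()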
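When the links come from scraping, as in the question, the host, directory and file name can be split out of each URL instead of being typed by hand. A rough sketch, assuming an anonymous login is accepted by the server; `split_ftp_url`, `download` and the `local_dir` parameter are hypothetical names introduced here:

```python
import ftplib
import os
import posixpath
from urllib.parse import urlsplit


def split_ftp_url(url):
    """Return (host, remote_directory, filename) for an ftp:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp:// URL: {!r}".format(url))
    directory, filename = posixpath.split(parts.path)
    return parts.hostname, directory.lstrip("/"), filename


def download(url, local_dir="."):
    """Fetch url with ftplib into local_dir and return the local path."""
    host, directory, filename = split_ftp_url(url)
    local_path = os.path.join(local_dir, filename)
    with ftplib.FTP(host) as ftp:
        ftp.login()  # no arguments: anonymous login (an assumption about the server)
        if directory:
            ftp.cwd(directory)
        with open(local_path, "wb") as f:
            ftp.retrbinary("RETR " + filename, f.write)
    return local_path
```

Passing a different `local_dir` controls which directory on the local machine receives the file.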
Aidas Bendoraitis
Rakesh
  • What if filename on server have some special characters e.g. ' ', $, & etc. Do I need to escape them? – Dilawar Sep 09 '15 at 12:30
  • The filename can be an arbitrary byte sequence with a few exceptions such as `b'\xff'` (I don't know any standard way to escape such names). Here's [more detail (in Russian)](http://ru.stackoverflow.com/a/523291/23044). You could ask a separate Stack Overflow question if you have a specific issue with ftp filenames – jfs Oct 01 '16 at 22:01
  • 1
    Encoding the filename from unicode to "utf-8" worked for me. Maybe that differs on different OS: `ftp.retrbinary(u"RETR täßt.jpg".encode('utf-8'), open('local.jpg', 'wb').write)` – Aidas Bendoraitis Apr 26 '17 at 14:50
  • If the data returned is larger than the blocksize I believe this will continue to overwrite the file and keep only the last block. – mgilbert May 15 '19 at 01:51
  • 1
    How can I specify which directory on the local machine to send it to? – opperman.eric Dec 20 '21 at 23:48
18

Try using the wget library for python. You can find the documentation for it here.

import wget
link = 'ftp://example.com/foo.txt'
wget.download(link)
wordsforthewise
Gaurav Shrivastava
  • 4
    Simplest and works well. You can also set the filename with the `out` parameter in wget.download. – wordsforthewise Sep 24 '19 at 15:51
  • 1
    This works for me; the other methods left me with a corrupted file. – Samoth Oct 14 '19 at 01:52
  • @anatoly-techtonik I think you're the author of this pypi module. Would you consider it safe to use? – yzorg Sep 18 '21 at 17:11
  • 1
    Caution: no release since 2015, and Homepage link on pypi is broken link (to bitbucket). Author's other projects moved to github, but I don't see this one. https://github.com/techtonik – yzorg Sep 18 '21 at 17:11
7

Use urllib2. For more specifics, check out this example from docs.python.org.

Here's a snippet from the tutorial that may help:

import urllib2

req = urllib2.Request('ftp://example.com')
response = urllib2.urlopen(req)
the_page = response.read()
Parker
7
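import os
import ftplib
from contextlib import closing

def download_file(host, port, login, passwd, orig_filename, local_filename):
    """Download orig_filename from the FTP server; return the local path, or None on failure."""
    with closing(ftplib.FTP()) as ftp:
        try:
            ftp.connect(host, port, 30*5)  # timeout in seconds (150 s)
            ftp.login(login, passwd)
            ftp.set_pasv(True)
            with open(local_filename, 'w+b') as f:
                res = ftp.retrbinary('RETR %s' % orig_filename, f.write)

            # 226 is the FTP reply code for a completed transfer; the text after it varies by server
            if not res.startswith('226'):
                print('Download of file {0} is not complete.'.format(orig_filename))
                os.remove(local_filename)
                return None

            return local_filename

        except ftplib.all_errors:
            print('Error during download from FTP')
            return None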
Roman Podlinov
  • I have a completely unrelated question to this thread but related to your code uploaded on github: http://stackoverflow.com/questions/27584233/sliding-window-how-to-get-window-location-on-image – user961627 Dec 20 '14 at 20:58
4

As several folks have noted, requests doesn't support FTP but Python has other libraries that do. If you want to keep using the requests library, there is a requests-ftp package that adds FTP capability to requests. I've used this library a little and it does work. The docs are full of warnings about code quality though. As of 0.2.0 the docs say "This library was cowboyed together in about 4 hours of total work, has no tests, and relies on a few ugly hacks".

import requests, requests_ftp
requests_ftp.monkeypatch_session()
s = requests.Session()  # create the Session after monkeypatching
response = s.get('ftp://example.com/foo.txt')
Nelson
  • This solution works, in my hands at least, as `s = requests.Session()` `response = s.get(...` (not as `requests.get`) – Matteo Ferla Dec 16 '19 at 13:42
4

If you want to take advantage of recent Python versions' async features, you can use aioftp (from the same family of libraries and developers as the more popular aiohttp library). Here is a code example taken from their client tutorial:

import aioftp

# The awaits below must run inside a coroutine (an async def function)
client = aioftp.Client()
await client.connect("ftp.server.com")
await client.login("user", "pass")
await client.download("tmp/test.py", "foo.py", write_into=True)
Dean Gurvitz
2

urllib2.urlopen handles ftp links.

sloth
Victor Gavro
  • 1
    For those new-ish to Python: Was renamed back to just urllib, still supports FTP. Basically see the top answer. – yzorg Sep 18 '21 at 17:37
1

urlretrieve did not work for me, and the official documentation says the legacy functions "might become deprecated at some point in the future."

import shutil
from urllib.request import URLopener

# Note: URLopener has itself been deprecated since Python 3.3
opener = URLopener()
url = 'ftp://ftp_domain/path/to/the/file'
store_path = 'path/to/your/local/storage'
with opener.open(url) as remote_file, open(store_path, 'wb') as local_file:
    shutil.copyfileobj(remote_file, local_file)
GoatWang