I know how to list all files in a directory on an FTP server:

import ftplib
ftp = ftplib.FTP()
ftp.connect("192.168.1.18", port=2240)
ftp.login()
ftp.cwd('path/to')
for f in ftp.mlsd():
    print(f)

But what's the best way to obtain a recursive filelist (i.e. files in subdirectories, subsubdirectories, etc.)?

In other words, an FTP equivalent of Python 3's glob.glob('path/to/**/*', recursive=True), which lists all files recursively.

I could do it by manually entering each dir and then calling mlsd() again, but I fear this will be very slow (listing files over FTP is already slow, as far as I remember), so this is not optimal.

How would one do it with SFTP? Would it be easier with SFTP to list all files recursively?

Martin Prikryl
Basj

2 Answers

I could do it by manually entering each dir and then calling mlsd() again

And that's the correct solution. The FTP protocol has no better standard way to retrieve a recursive listing, so there is no room for optimization other than parallelizing the operation. See also Downloading a directory tree with ftplib.
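To illustrate the parallelizing idea, here is a minimal sketch, not a tested production implementation. It assumes the server accepts several simultaneous connections and that MLSD returns bare entry names. The `connect` parameter is a hypothetical zero-argument callable that returns a logged-in `ftplib.FTP` object (for example, a function wrapping the `FTP()`, `connect()` and `login()` calls from the question). Directories are listed level by level, and each worker thread uses its own connection.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def list_recursive_parallel(connect, root, workers=4):
    """Return the paths of all files below `root`.

    `connect` is a hypothetical zero-argument callable returning a
    logged-in ftplib.FTP object; each worker thread calls it once.
    """
    local = threading.local()
    connections = []  # every connection opened, so they can be closed at the end

    def list_dir(path):
        # An FTP control connection handles one command at a time,
        # so each thread gets its own connection.
        if not hasattr(local, "ftp"):
            local.ftp = connect()
            connections.append(local.ftp)
        files, dirs = [], []
        for name, facts in local.ftp.mlsd(path):
            full = path.rstrip("/") + "/" + name
            kind = facts.get("type")
            if kind == "dir":
                dirs.append(full)
            elif kind == "file":
                files.append(full)
            # "cdir"/"pdir" (the "." and ".." entries) and other types are ignored
        return files, dirs

    all_files = []
    pending = [root]
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            while pending:
                next_level = []
                # List every directory of the current depth concurrently
                for files, dirs in pool.map(list_dir, pending):
                    all_files.extend(files)
                    next_level.extend(dirs)
                pending = next_level
    finally:
        for ftp in connections:
            ftp.quit()
    return all_files
```

Whether this helps depends on the server: many limit the number of connections per client, so `workers` may need to stay small.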

Some FTP servers support a non-standard -R switch with some file listing commands (not sure about MLSD). So if you are willing to rely on non-standard functionality and your particular server supports it, you can optimize your code this way. See also Getting all FTP directory/file listings recursively in one call.
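If the server does honor the switch, the approach might look like the sketch below. It relies on assumptions: the server accepts `LIST -R` and answers in the Unix `ls -lR` layout (a `path:` header line per directory, followed by long-format entries). Neither is guaranteed by the FTP standard, and the helper names `parse_ls_recursive` and `list_recursive_with_switch` are made up for illustration.

```python
def parse_ls_recursive(lines, root="."):
    """Extract file paths from `ls -lR`-style output (assumed format)."""
    current = root
    paths = []
    for line in lines:
        line = line.rstrip()
        if not line or line.startswith("total "):
            continue
        if line.endswith(":"):
            # Directory header such as "./sub:"
            # (a file whose name ends with ":" would be misread here)
            current = line[:-1]
            continue
        # Assumed long format: perms links owner group size month day time name
        parts = line.split(None, 8)
        if len(parts) < 9:
            continue
        perms, name = parts[0], parts[8]
        if perms.startswith("-"):  # regular file
            paths.append(current.rstrip("/") + "/" + name)
    return paths


def list_recursive_with_switch(ftp, remotedir):
    """Fetch the whole tree in one command; works only if the server supports -R."""
    lines = []
    ftp.cwd(remotedir)
    ftp.retrlines("LIST -R", lines.append)
    return parse_ls_recursive(lines)
```

The returned paths are relative to `remotedir`. If the server ignores the switch, the result will simply contain the top-level files, so it is worth checking the output against a directory known to have subdirectories.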

For SFTP, see Recursive SFTP listdir in Python?

Martin Prikryl

As LIST -R, NLST -R and MLSD -R were not working for me, I followed @MartinPrikryl's recommendation, and here is an FTP solution:

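import ftplib, time

def list_recursive(ftp, remotedir):
    # Enter the directory, then list it; mlsd() yields (name, facts) pairs
    ftp.cwd(remotedir)
    for name, facts in ftp.mlsd():
        if facts['type'] in ('cdir', 'pdir'):
            continue  # skip the "." and ".." entries some servers return
        if facts['type'] == 'dir':
            remotepath = remotedir + "/" + name
            print(time.time() - t0, remotepath)  # elapsed seconds, directory path
            list_recursive(ftp, remotepath)
        else:
            print((name, facts))

ftp = ftplib.FTP()
ftp.connect("192.168.1.18", port=2240)
ftp.login()
t0 = time.time()  # start of the timing, read as a global in list_recursive()
list_recursive(ftp, '/sdcard/music')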

It took 344 seconds for ~20k files in ~900 folders (my FTP server is on a phone: cx File Explorer app).


For comparison, here is a solution for SFTP:

import pysftp

def list_recursive(sftp, remotedir):
    # listdir_attr() returns one attributes object (name, size, mtime, ...) per entry
    for entry in sftp.listdir_attr(remotedir):
        remotepath = remotedir + "/" + entry.filename
        if sftp.isdir(remotepath):
            print(remotepath)
            list_recursive(sftp, remotepath)  # descend into the subdirectory
        else:
            print(entry.st_size, entry.st_mtime, entry.filename)

cnopts = pysftp.CnOpts()
cnopts.hostkeys = None  # local testing only: this disables host key verification
with pysftp.Connection('192.168.1.18', port=2222, username='ssh', password='', cnopts=cnopts) as sftp:
    list_recursive(sftp, 'music')

It took 222 seconds for ~20k files in ~900 folders (I used SSH/SFTP Server app on an Android phone).
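A possible refinement, not measured here: `listdir_attr()` already returns each entry's `st_mode`, so the directory check can be done locally with `stat.S_ISDIR` instead of a separate `isdir()` request per entry. The sketch below assumes an `sftp` object with a `listdir_attr()` method as in pysftp or Paramiko; the name `walk_sftp` is made up for illustration.

```python
import stat


def walk_sftp(sftp, remotedir):
    """Yield (path, attributes) for every non-directory entry below remotedir."""
    for entry in sftp.listdir_attr(remotedir):
        remotepath = remotedir.rstrip("/") + "/" + entry.filename
        # st_mode comes with the listing, so no extra round trip is needed
        if entry.st_mode is not None and stat.S_ISDIR(entry.st_mode):
            yield from walk_sftp(sftp, remotepath)
        else:
            yield remotepath, entry
```

One behavioral difference to keep in mind: the mode in a listing describes the entry itself, so a symbolic link to a directory is reported as a link and is not descended into, whereas `isdir()` follows links.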

Basj
  • +1. One obligatory warning, though: do not set `cnopts.hostkeys = None` unless you do not care about security. See [Verify host key with pysftp](https://stackoverflow.com/q/38939454/850848). – Martin Prikryl Dec 03 '20 at 12:02