
I copied the example code from the https://asyncssh.readthedocs.io/en/latest/#sftp-client website and changed it slightly to fit my requirements.

I was able to connect to the SFTP site and download the files from the "/Exports" folder, but it seemed like the files were being downloaded one by one instead of several at a time.

My code:

import asyncio
import asyncssh
import sys


async def run_client():
    async with asyncssh.connect(host=host, username=username, password=password, port=port_no, known_hosts=None) as conn:
        async with conn.start_sftp_client() as sftp:

            await sftp.get(
                remotepaths='/Exports',
                localpath=r'Path on my local machine',
                preserve=True,
                recurse=True,
                max_requests=128
            )

try:
    asyncio.get_event_loop().run_until_complete(run_client())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SFTP operation failed: ' + str(exc))

I need to download 9000 files that are mostly 1 KB each, and I can clearly see them downloading one by one. Any idea what the issue might be?

JM Nel
  • I do not think that asynchronous = parallel. I believe the code does what it should. If you need parallel downloads, you need to run multiple parallel operations. – Martin Prikryl Sep 23 '20 at 14:47

1 Answer


Here is a solution that downloads the files concurrently: glob() lists the remote files, and asyncio.gather() runs one get() per file so the transfers overlap on the single SFTP session instead of being awaited one at a time.

import asyncio
import posixpath

import asyncssh


async def download_file(sftp, file: str, localdir: str):
    # sftp.glob() returns full remote paths, so keep only the file name
    # when building the local destination path.
    await sftp.get(file, localpath=f"{localdir}/{posixpath.basename(file)}")


async def run_client():
    async with asyncssh.connect(
        "host", username="username", password="password"
    ) as conn:
        async with conn.start_sftp_client() as sftp:
            files = await sftp.glob("/Exports/*")
            tasks = (download_file(sftp, file, localdir="/") for file in files)
            await asyncio.gather(*tasks)


asyncio.run(run_client())

  • Nice solution, but how do you limit the number of concurrent requests? It would be nice to sleep x seconds per n requests: gather 50, sleep 1 sec, gather the next 50, etc. – Paal Pedersen Mar 02 '21 at 10:41
  • @PaalPedersen A neat method to limit concurrent requests with asyncio is to use a semaphore which each client has to wait for before proceeding. You can find an example (and a short snippet below) at the following question: https://stackoverflow.com/a/48486557/7583539 – Septatrix Nov 03 '21 at 20:27
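
A minimal sketch of that semaphore approach, assuming the same placeholder host, credentials, and paths as the answer above (the limit of 50 is an arbitrary example, not a recommendation):

import asyncio
import posixpath

import asyncssh


async def download_file(sftp, file: str, localdir: str, sem: asyncio.Semaphore):
    # Each task must acquire the semaphore before starting its get(),
    # so at most `limit` downloads are in flight at any moment.
    async with sem:
        await sftp.get(file, localpath=f"{localdir}/{posixpath.basename(file)}")


async def run_client():
    sem = asyncio.Semaphore(50)  # cap of 50 concurrent downloads (placeholder)
    async with asyncssh.connect(
        "host", username="username", password="password"
    ) as conn:
        async with conn.start_sftp_client() as sftp:
            files = await sftp.glob("/Exports/*")
            tasks = (download_file(sftp, file, "/", sem) for file in files)
            await asyncio.gather(*tasks)


asyncio.run(run_client())

All 9000 tasks are still created up front, but only 50 hold the semaphore (and therefore an in-flight get()) at once; the rest wait at the async with sem line until a slot frees up, which avoids the fixed sleep-per-batch timing.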