
I wrote a program that uses pysftp to download files from a Google Cloud Storage blob and then upload them to an SFTP server from the file system. I wondered whether I could bypass the file system and upload the stream to SFTP directly.

I am running the program on Google Cloud Functions, where the file system is read-only, so I can't write to disk. Transferring the data directly would also be much faster, since it avoids writing to and reading from the disk.

# Download each matching blob from the bucket into the local directory
for blob in storage_client.list_blobs(bucket, prefix=prefix):
    source = blob.name
    destination = local_download_dir + "/" + remove_prefix(blob.name, prefix)
    blob.download_to_filename(destination)

...

with pysftp.Connection(Config.SFTP_HOST, port=Config.SFTP_PORT, username=Config.SFTP_USER, password=Config.SFTP_PWD, cnopts=cnopts) as sftp:

...
files = listdir(local_download_dir)
for f in files:
    sftp.put(local_download_dir + "/" + f)  # upload file to remote


1 Answer


Answering my own question to support the community. I hope some of you find it useful.

I initially tried the following. It worked, but it may lead to memory issues for big files, since the entire blob is loaded into memory:

from io import BytesIO

# Download the entire blob into memory, then upload the buffer via SFTP
sftp.putfo(BytesIO(blob.download_as_bytes()), destination)

I then found a better approach using blob.open, which streams the blob instead of loading it fully into memory:

# Stream the blob directly from GCS to the SFTP server; no local file needed
with blob.open("rb") as f:
    sftp.putfo(f, destination)
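For reference, the streaming approach can replace the whole download-and-upload loop from the question. The sketch below is a minimal example assuming the same storage_client, bucket, prefix, remove_prefix helper, and Config settings as in the question; remote_dir is a hypothetical target directory on the SFTP server:

import pysftp
from google.cloud import storage

storage_client = storage.Client()
cnopts = pysftp.CnOpts()

with pysftp.Connection(Config.SFTP_HOST, port=Config.SFTP_PORT,
                       username=Config.SFTP_USER, password=Config.SFTP_PWD,
                       cnopts=cnopts) as sftp:
    for blob in storage_client.list_blobs(bucket, prefix=prefix):
        # Build the remote path by stripping the prefix from the blob name
        destination = remote_dir + "/" + remove_prefix(blob.name, prefix)
        # Stream straight from GCS to SFTP; nothing is written to local disk
        with blob.open("rb") as f:
            sftp.putfo(f, destination)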

In stream mode, the blob is read in chunks; the default chunk_size is 40 MB.
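If 40 MB buffers are too large for a memory-constrained environment like Cloud Functions, blob.open accepts a chunk_size argument to shrink them. A short sketch, with an illustrative 8 MB chunk size:

# Read the blob in 8 MB chunks to lower peak memory usage (illustrative value)
with blob.open("rb", chunk_size=8 * 1024 * 1024) as f:
    sftp.putfo(f, destination)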

  • This was downvoted by someone who thought I copied it. I did not. Coincidentally, someone else had the same issue while I was working on this problem. The solution works. The other question: https://stackoverflow.com/q/73230723/850848 – Titu Aug 05 '22 at 17:34