I am trying to read a CSV file from an SFTP server directly into memory in Python. I tried the following, which works fine for an FTP connection, but not for SFTP.

E.g., I want to replicate:

df = pd.read_csv(...)

But without storing it locally first (the reason being that I want to run it as a Cloud Function, and I don't want local files in my cache).

How can I do it differently?

def read_file_sftp_local_memory(sftp, path, filename):

    flo = BytesIO()
    path_query = "".join(['RETR ', path, '/', filename])
    sftp.retrbinary(path_query, flo.write)
    flo.seek(0)
    return flo

I also tried the following:

def read_file_csv(sftp, path, filename):

    # Download
    sftp.get("/".join( os.path.join(path, filename) ), filename)

    # Read
    df = pd.read_csv(filename)

    # Delete
    os.remove(filename)

    # Return
    return df

But this error is returned:

raise IOError(text)
OSError: Failure
Martin Prikryl
WJA

1 Answer

Assuming you are using the Paramiko SFTP library, use the SFTPClient.open method:

with sftp.open(path) as f:
    f.prefetch()
    df = pd.read_csv(f)

For the purpose of the prefetch call, see "Reading file opened with Python Paramiko SFTPClient.open method is slow".
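
If you would rather keep the buffer-based shape of the FTP helper from the question, a minimal sketch follows. It assumes `sftp` is an already connected Paramiko `SFTPClient` and relies on its `getfo` method, which writes a remote file into any file-like object. The function names and the `remote_path` parameter are hypothetical, chosen only for illustration.

```python
from io import BytesIO


def read_file_sftp_memory(sftp, remote_path):
    """Download remote_path over SFTP into an in-memory buffer.

    Assumption: `sftp` is a connected paramiko.SFTPClient.
    """
    flo = BytesIO()
    # getfo copies the remote file's bytes into the given file-like object
    sftp.getfo(remote_path, flo)
    # Rewind so the caller reads from the start of the buffer
    flo.seek(0)
    return flo


def read_csv_sftp(sftp, remote_path):
    """Parse a remote CSV into a DataFrame without touching local disk."""
    # Imported lazily so the helper above does not depend on pandas
    import pandas as pd

    return pd.read_csv(read_file_sftp_memory(sftp, remote_path))
```

Calling `read_csv_sftp(sftp, path)` should then behave like `pd.read_csv(...)` on a local file, with the whole file held in memory rather than written to disk, so it is best suited to files that fit comfortably in the memory available to the Cloud Function.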

Martin Prikryl