
I'm trying to load a .csv file stored on an FTP server (SFTP protocol). I'm using Python with the pysftp library. On the server, the CSV file is inside a .zip file. Is there a way to open the zip and retrieve only the CSV file inside it?

Thank you in advance.

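import pysftp

cnopts = pysftp.CnOpts()
cnopts.hostkeys = None  # disables host key checking (insecure, testing only)

# Make connection to the SFTP server
with pysftp.Connection(hostname,
                       username=sftp_username,
                       password=sftp_pw,
                       cnopts=cnopts) as sftp:
    # pysftp.cd changes the local directory, sftp.cd the remote one
    with pysftp.cd(download_directory):
        with sftp.cd('download_directory'):
            print(f'Downloading this file: {filename}')
            sftp.get(filename, preserve_mtime=True)
# the connection is closed automatically when the with block exits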
Imerial
  • Just for clarity... there are other files in that .zip? – tdelaney Feb 27 '20 at 08:11
  • Beware: SFTP and FTP are **very** different protocols. But it does not matter for your question: I know of no FTP or SFTP server that can, out of the box, extract elements from a ZIP archive on the server side. But a dedicated service could... – Serge Ballesta Feb 27 '20 at 08:12
  • If you have sftp, you probably have ssh too. How about running a remote unzip of that one file and the return stream is the csv you want? – tdelaney Feb 27 '20 at 08:14
  • Are there other files in the ZIP file? => Is the point of the question to avoid downloading a whole ZIP file when you want only one tiny CSV file (compared to a huge ZIP file)? That is not as impossible as @Serge claims (you do have to extract the file *locally*, but you do not have to download the whole ZIP for that). It's definitely possible with SFTP to download only part of a (ZIP) file, and also with *some* FTP servers. So it matters whether it is SFTP or FTP. – Martin Prikryl Feb 27 '20 at 08:30
  • For a similar (FTP) question, see [Get files names inside a zip file on FTP server without downloading whole archive](https://stackoverflow.com/q/53143518/850848). – Martin Prikryl Feb 27 '20 at 08:31
  • Thanks to everybody who responded. I'm sorry for not being clear about this. It is the FTP protocol. Perhaps my entire code is wrong, since I thought it was SFTP. What I'm working on is a data feed where I pull a CSV file from inside a .zip on an FTP server and populate my database with the information from the CSV file. @tdelaney, do you have an example of how to do the remote unzip? – Imerial Feb 27 '20 at 18:23
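
The comments above suggest extracting the CSV locally rather than on the server. A minimal sketch of that idea, using only the standard library: `read_csv_from_zip` and `download_zip_over_ftp` are hypothetical helper names, and picking the first `.csv` entry when no member name is given is an assumption made for illustration.

```python
import csv
import io
import zipfile
from ftplib import FTP


def read_csv_from_zip(fileobj, member=None, encoding="utf-8"):
    """Return the rows of one CSV member of a ZIP archive.

    fileobj is any seekable binary file object. If member is None, the
    first entry ending in .csv is used (an assumption for this sketch).
    """
    with zipfile.ZipFile(fileobj) as archive:
        if member is None:
            csv_names = [n for n in archive.namelist()
                         if n.lower().endswith(".csv")]
            if not csv_names:
                raise ValueError("no .csv entry found in the archive")
            member = csv_names[0]
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return list(csv.reader(text))


def download_zip_over_ftp(host, user, password, remote_path):
    """Download a whole remote file over plain FTP into memory."""
    buffer = io.BytesIO()
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.retrbinary(f"RETR {remote_path}", buffer.write)
    buffer.seek(0)
    return buffer
```

With plain FTP this downloads the whole archive first, for example `rows = read_csv_from_zip(download_zip_over_ftp(host, user, pw, "feed.zip"))`, where the host, credentials and path are placeholders. With SFTP, passing the object returned by pysftp's `sftp.open(remote_zip, "rb")` instead should let `zipfile` seek within the remote file and read only the parts it needs, though that depends on the remote file object behaving as a seekable stream.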

1 Answer


If you have ssh access to the remote host, and you know the remote path to the zip file and which zip utilities are installed on that host, you can use an ssh client to run the unzip command remotely and capture its output. Here, my target is a Linux machine and the zip file is under the login user's home directory. The paramiko ssh client can do the work.

It's a good idea to log into the remote server via ssh first and practice, to see what the path structure is like.

import sys
import paramiko
import shutil

def sshclient_exec_command_binary(sshclient, command, bufsize=-1,
    timeout=None, get_pty=False):
    """Paramiko SSHClient helper that implements exec_command with binary
    output.
    """
    chan = sshclient.get_transport().open_session()
    if get_pty:
        chan.get_pty()
    chan.settimeout(timeout)
    chan.exec_command(command)
    stdin = chan.makefile('wb', bufsize)
    stdout = chan.makefile('rb', bufsize)
    stderr = chan.makefile_stderr('rb', bufsize)
    return stdin, stdout, stderr

# example gets user/pw from command line
if len(sys.argv) != 3:
    print("usage: test.py username password")
    sys.exit(1)
username, password = sys.argv[1:3]

# put your host/file info here
hostname = "localhost"
remote_zipfile = "tmp/mytest.zip"
file_to_extract = "myfile"

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(hostname, username=username, password=password)
unzip_cmd = "unzip -p {} {}".format(remote_zipfile, file_to_extract)
print("running", unzip_cmd)
stdin, out, err = sshclient_exec_command_binary(ssh, unzip_cmd)

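# if the command worked, out is a binary file-like object to read.
print("writing", file_to_extract)
with open(file_to_extract, 'wb') as out_fp:
    shutil.copyfileobj(out, out_fp)

# a non-zero exit status means unzip failed; stderr holds the reason
if out.channel.recv_exit_status() != 0:
    print("unzip failed:", err.read().decode(errors="replace"))
ssh.close()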
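
Since the goal stated in the comments is to load the CSV into a database, the captured output could also be parsed directly instead of being written to disk. A small sketch under stated assumptions: `csv_rows_from_stream` is a hypothetical helper, the data is assumed to be UTF-8, and the whole stream is read into memory, which is only reasonable for modest file sizes.

```python
import csv
import io


def csv_rows_from_stream(stream, encoding="utf-8"):
    """Read a binary file-like object to the end and parse it as CSV."""
    text = stream.read().decode(encoding)
    # newline="" lets the csv module handle line endings itself
    return list(csv.reader(io.StringIO(text, newline="")))
```

In the script above this would be called as `rows = csv_rows_from_stream(out)` in place of the `shutil.copyfileobj` step, since a stream can only be consumed once.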
tdelaney