
I want to read multiple big files that live on a CentOS server with Python. I wrote some simple code for this and it works, but the entire file ends up in a paramiko object (paramiko.sftp_file.SFTPFile) before I can process the lines. The performance is not good, and I would like to process the file and write it to CSV piece by piece, because processing the entire file at once can hurt performance. Is there a way to solve this?

import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(host, port, username, password)

sftp_client = ssh.open_sftp()
remote_file = sftp_client.open('/root/bigfile.csv')

try:
    for line in remote_file:
        pass  # process each line here
finally:
    remote_file.close()
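As a side note on the performance complaint: paramiko's SFTPFile.prefetch() asks the server for blocks ahead of the current read position, which can make a sequential loop like the one above noticeably faster. A minimal sketch of the same loop with prefetching enabled (connection setup as above; the larger bufsize is an illustrative choice, not something from the question):

sftp_client = ssh.open_sftp()
remote_file = sftp_client.open('/root/bigfile.csv', 'r', bufsize=32768)

try:
    remote_file.prefetch()        # request blocks ahead of the read position
    for line in remote_file:      # still one line at a time, but with less per-read latency
        pass                      # process each line here
finally:
    remote_file.close()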
Nestor
1 Answer


The following could solve your problem.

import shutil

def lazy_loading_ftp_file(sftp_host_conn, filename):
    """
    Lazily stream a remote file to disk, as a fallback when a simple sftp.get call fails.
    :param sftp_host_conn: callable that returns an open SFTP connection (usable as a context manager)
    :param filename: remote filename to be downloaded
    :return: dict with the download status; the file is written to the current directory
    """
    try:
        with sftp_host_conn() as host:
            sftp_file_instance = host.open(filename, 'rb')
            try:
                with open(filename, 'wb') as out_file:
                    # copyfileobj copies in fixed-size chunks, so the whole
                    # file is never held in memory at once
                    shutil.copyfileobj(sftp_file_instance, out_file)
            finally:
                sftp_file_instance.close()
        return {"status": "success", "msg": "successfully downloaded file: {}".format(filename)}
    except Exception as ex:
        return {"status": "failed", "msg": "Exception in lazy reading: {}".format(ex)}

This will avoid reading the whole thing into memory at once.
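If the goal is to process the data and write a CSV piece by piece rather than download the whole file first, the same idea can be applied to the original loop: read one line at a time from the remote file and write each processed row to the local CSV immediately. A minimal sketch, assuming a hypothetical process_line() that turns one input line into one output row, a local output.csv destination, and the connection details from the question:

import csv
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(host, port, username, password)

sftp_client = ssh.open_sftp()
remote_file = sftp_client.open('/root/bigfile.csv', 'r')

try:
    with open('output.csv', 'w', newline='') as local_csv:
        writer = csv.writer(local_csv)
        for line in remote_file:          # one line at a time over SFTP
            row = process_line(line)      # hypothetical: parse/transform a single line
            writer.writerow(row)          # written immediately, so the output never buffers in memory
finally:
    remote_file.close()

This way neither the input nor the output has to fit in memory at any point.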

Ramsey