1

I am using pysftp to transfer files over a network, and I want to make sure the transfer is reliable. pysftp has a put method that returns an SFTPAttributes object containing the size of the transferred file. Is comparing that size with the local file size enough to verify a successful transfer?

Or... it seems I could also open the transferred file and use its check method. Is that the correct way? I am a little confused. Thank you.

Martin Prikryl

2 Answers

1

pysftp's Connection.put already checks the size of the uploaded file. That is what the confirm=True parameter is for, so you do not need to do anything more:

whether to do a stat() on the file afterwards to confirm the file size
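A minimal sketch of what that size check amounts to, written against any client object whose put method accepts confirm and returns an object with an st_size attribute (as pysftp's Connection.put and Paramiko's SFTPClient.put do). The helper name upload_and_verify is hypothetical. As far as I know, put with confirm=True already raises an IOError on a size mismatch, so the explicit comparison here is only for illustration:

```python
import os


def upload_and_verify(sftp, local_path, remote_path):
    """Upload a file and compare the reported remote size with the local size.

    `sftp` is assumed to be a connected pysftp.Connection or
    paramiko.SFTPClient (hypothetical helper, not part of either library).
    """
    # confirm=True (the default) makes put() stat() the remote file afterwards
    # and return its attributes.
    attrs = sftp.put(local_path, remote_path, confirm=True)

    local_size = os.path.getsize(local_path)
    if attrs.st_size != local_size:
        raise IOError(
            "size mismatch: local %d bytes, remote %d bytes"
            % (local_size, attrs.st_size)
        )
    return attrs.st_size
```

With a real connection, the call would look like upload_and_verify(sftp, "report.csv", "/upload/report.csv"), where both paths are placeholders.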

While you can theoretically verify the checksum with SFTPFile.check, it won't typically work, as most SFTP servers, including the widespread OpenSSH, do not support calculating checksums. So the call will fail. You would have to resort to running some shell command to calculate the checksum. See:
Comparing MD5 of downloaded files against files on an SFTP server in Python
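A rough sketch of the shell-command approach, assuming the server allows shell access and has a sha256sum binary on its PATH (neither is guaranteed, and SFTP-only accounts will not work). The helper name remote_sha256 is hypothetical; ssh is assumed to be a connected paramiko.SSHClient:

```python
import shlex


def remote_sha256(ssh, remote_path):
    """Return the hex SHA-256 of a remote file by running sha256sum remotely.

    Assumes `ssh` is a connected paramiko.SSHClient and that the server
    provides a `sha256sum` command (an assumption, not a given).
    """
    # Quote the path so spaces or shell metacharacters are passed literally.
    command = "sha256sum " + shlex.quote(remote_path)

    # exec_command returns file-like objects for stdin, stdout and stderr.
    _stdin, stdout, stderr = ssh.exec_command(command)
    output = stdout.read().decode().strip()
    if not output:
        raise IOError(
            "remote checksum failed: " + stderr.read().decode().strip()
        )

    # sha256sum prints "<hex digest>  <file name>"; keep only the digest.
    return output.split()[0]
```

The result can then be compared with a digest computed locally with hashlib.sha256.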

But it's questionable whether it is worth the effort, see:
How to perform checksums during a SFTP file transfer for data integrity?


Though these days, you should not use pysftp, as it is a dead project. Use Paramiko directly instead; see pysftp vs. Paramiko. For essentially the same question about Paramiko, see: How to check if Paramiko successfully uploaded a file to an SFTP server?

Martin Prikryl
  • 1
    Solved my problem. I like the part "questionable whether it is worth..". Maybe it is not necessary because SSH and TCP already take care of it. And thank you for the reminder about using Paramiko. – zhipeng wang Jun 30 '22 at 09:59
0

While checking the file size is already a good indicator of a successful transfer, it will not tell you whether a bit has flipped, in which case the remote file no longer has exactly the same content. To detect that, you should use a checksum, e.g. one provided by a hashing algorithm like MD5, see https://en.wikipedia.org/wiki/MD5.

Basically this is what the check method (https://docs.paramiko.org/en/latest/api/sftp.html?highlight=paramiko.sftp_file.sftpfile#paramiko.sftp_file.SFTPFile.check) you mentioned will compute. Then you must compute the same checksum for your local file and compare.
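A hedged sketch of that comparison, assuming sftp is a connected paramiko.SFTPClient and that the server supports the "check-file" extension that SFTPFile.check relies on (many servers do not, in which case the call raises an IOError). The helper names local_digest and verify_with_check are hypothetical:

```python
import hashlib


def local_digest(path, algorithm="md5", chunk_size=65536):
    """Hash a local file in chunks so large files are not loaded into memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.digest()


def verify_with_check(sftp, local_path, remote_path, algorithm="md5"):
    """Compare a local digest with the one computed by the server.

    Returns True or False when the server could compute the digest, and
    None when it could not (for example, no "check-file" support).
    """
    try:
        with sftp.open(remote_path, "rb") as remote_file:
            # SFTPFile.check asks the server to hash the file and
            # returns the raw digest bytes.
            remote = remote_file.check(algorithm)
    except IOError:
        return None
    return remote == local_digest(local_path, algorithm)
```

Returning None rather than raising is a design choice for this sketch: it lets the caller fall back to the size check when the server cannot compute checksums.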

Carlos Horn
  • 1,115
  • 4
  • 17