Backstory: I'm trying to pull some data from an FTP login I was given. The data gets updated constantly, about daily, and I believe they wipe the FTP server at the end of each week or month. I was thinking about inputting a date and having the script run daily to see if any files match that date, but if the server's time isn't accurate, that could cause data loss. For now I just want to download ALL the files, and then I'll work on fine-tuning it.
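One possible way to sidestep the server-clock concern, as a rough sketch only: rather than matching on dates, compare the remote listing against what is already on disk. The names `pick_new_files` and `local_dir` are hypothetical (they are not part of the script below), and this assumes the server never reuses a file name for new data. It is written for Python 3.

```python
import os

def pick_new_files(remote_names, local_dir):
    """Return the remote names that do not yet exist in local_dir.

    Assumption: a file name that already exists locally was fully
    downloaded on an earlier run and never changes on the server.
    """
    already_have = set(os.listdir(local_dir))
    # Preserve the order of the remote listing.
    return [name for name in remote_names if name not in already_have]
```

The result could be passed to the download step in place of the full listing. Note that a partially downloaded file would count as "already have", so a size check would still be needed on top of this.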
I haven't worked much with FTP in code before, but it seems simple enough. The problem I'm having is that small files download without a problem, and their file sizes check out and match. When the script tries to download a big file that would normally take a few minutes, it gets to a certain point (almost completing the file) and then just stops, and the script hangs.
For example:
It tries to download a file that is 373485927 bytes in size. The script runs and downloads that file up to 373485568 bytes. It ALWAYS stops at this amount, even after I tried different methods and changed some code.
I don't understand why it always stops at this byte, or why it works fine with smaller files (1000 bytes and under).
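A quick check of the two numbers above may be a useful clue. The 4096 here is an assumption (a common buffer or block size), not something the script sets:

```python
expected = 373485927   # size of the file on the server, from the post
received = 373485568   # size on disk when the script hangs, from the post

missing = expected - received
print(missing)            # 359
# 4096 is an assumed buffer/block size, chosen only to test the pattern.
print(received % 4096)    # 0
print(received // 4096)   # 91183
```

So the local file is only 359 bytes short, and its size is an exact multiple of 4096. That is consistent with (but does not prove) the final partial chunk still sitting in the write buffer of a file that was never closed, rather than the data never arriving.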
import os
import sys
import base64
import ftplib

def get_files(ftp, filelist):
    for f in filelist:
        try:
            print "Downloading file " + f + "\n"
            local_file = os.path.join('.', f)
            file = open(local_file, "wb")
            ftp.retrbinary('RETR ' + f, file.write)
        except ftplib.all_errors, e:
            print str(e)

        file.close()
    ftp.quit()

def list_files(ftp):
    print "Getting directory listing...\n"
    ftp.dir()
    filelist = ftp.nlst()
    #determine new files to DL, pass to get_files()
    #for now we will download all each execute
    get_files(ftp, filelist)

def get_conn(host,user,passwd):
    ftp = ftplib.FTP()
    try:
        print "\nConnecting to " + host + "...\n"
        ftp.connect(host, 21)
    except ftplib.all_errors, e:
        print str(e)

    try:
        print "Logging in...\n"
        ftp.login(user, base64.b64decode(passwd))
    except ftplib.all_errors, e:
        print str(e)

    ftp.set_pasv(True)
    list_files(ftp)

def main():
    host = "host.domain.com"
    user = "admin"
    passwd = "base64passwd"
    get_conn(host,user,passwd)

if __name__ == '__main__':
    main()
The output looks like this; dddd.tar.gz is the big file, and its download never finishes.
Downloading file aaaa.del.gz
Downloading file bbbb.del.gz
Downloading file cccc.del.gz
Downloading file dddd.tar.gz
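One thing that might be worth trying, as a sketch and not a confirmed fix: open each local file in a `with` block so it is closed (and its buffer flushed) even when the transfer raises, and collect failures instead of only printing them. `download_all` and `dest_dir` are hypothetical names, and the code is written for Python 3 rather than the Python 2 used above.

```python
import ftplib
import os

def download_all(ftp, names, dest_dir='.'):
    """Download each name with RETR; return {name: exception} for failures."""
    failed = {}
    for name in names:
        local_path = os.path.join(dest_dir, name)
        try:
            # The with-block closes the file even if retrbinary raises,
            # so bytes still held in the write buffer reach the disk.
            with open(local_path, 'wb') as out:
                ftp.retrbinary('RETR ' + name, out.write)
        except ftplib.all_errors as exc:
            failed[name] = exc
    return failed
```

This does not stop a hang by itself. Creating the connection with a timeout, for example `ftplib.FTP(timeout=60)` (the 60 seconds is an arbitrary assumed value), should turn a stalled socket into an exception that lands in `failed`, at which point the file has already been closed and flushed.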