9

I would like to write a Python script that deletes files from an FTP server once they have reached a certain age. I prepared the script below, but it throws the error message: WindowsError: [Error 3] The system cannot find the path specified: '/test123/*.*'

Does anyone have an idea how to resolve this issue? Thank you in advance!

import os, time
from ftplib import FTP

ftp = FTP('127.0.0.1')
print "Automated FTP Maintainance"
print 'Logging in.'
ftp.login('admin', 'admin')

# This is the directory that we want to go to
path = 'test123'
print 'Changing to:' + path
ftp.cwd(path)
files = ftp.retrlines('LIST')
print 'List of Files:' + files 
#--everything works fine until here!...

#--The Logic which shall delete the files after the are 7 days old--
now = time.time()
for f in os.listdir(path):
  if os.stat(f).st_mtime < now - 7 * 86400:
    if os.path.isfile(f):
        os.remove(os.path.join(path, f))
except:
    exit ("Cannot delete files")

print 'Closing FTP connection'
ftp.close()
Brian Tompsett - 汤莱恩
Tom

5 Answers

10

OK. Assuming your FTP server supports the MLSD command, make a module with the following code (this is code from a script I use to sync a remote FTP site with a local directory):

module code

# for python ≥ 2.6
import sys, os, time, ftplib
import collections
FTPDir= collections.namedtuple("FTPDir", "name size mtime tree")
FTPFile= collections.namedtuple("FTPFile", "name size mtime")

class FTPDirectory(object):
    def __init__(self, path='.'):
        self.dirs= []
        self.files= []
        self.path= path

    def getdata(self, ftpobj):
        ftpobj.retrlines('MLSD', self.addline)

    def addline(self, line):
        # an MLSD line looks like "type=file;size=123;modify=20100804124800; name"
        data, _, name= line.partition('; ')
        fields= data.split(';')
        for field in fields:
            field_name, _, field_value= field.partition('=')
            # fact names are case-insensitive; some servers send 'Type', 'Modify', ...
            field_name= field_name.lower()
            if field_name == 'type':
                target= self.dirs if field_value.lower() == 'dir' else self.files
            elif field_name in ('sizd', 'size'):
                size= int(field_value)
            elif field_name == 'modify':
                mtime= time.mktime(time.strptime(field_value, "%Y%m%d%H%M%S"))
        if target is self.files:
            target.append(FTPFile(name, size, mtime))
        else:
            # FTP paths use '/', whatever the client OS (os.path.join would use '\' on Windows)
            target.append(FTPDir(name, size, mtime, self.__class__(self.path + '/' + name)))

    def walk(self):
        for ftpfile in self.files:
            yield self.path, ftpfile
        for ftpdir in self.dirs:
            for path, ftpfile in ftpdir.tree.walk():
                yield path, ftpfile

class FTPTree(FTPDirectory):
    def getdata(self, ftpobj):
        super(FTPTree, self).getdata(ftpobj)
        for dirname in self.dirs:
            ftpobj.cwd(dirname.name)
            dirname.tree.getdata(ftpobj)
            ftpobj.cwd('..')

single directory case

If you want to work on the files of a directory, you can:

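import ftplib, time
# assumes the module code above was saved as ftptool.py (the name is arbitrary)
from ftptool import FTPDirectory, FTPTree

quite_old= time.time() - 7*86400 # seven days

# hostname, username, password and the_directory_to_work_on are placeholders
site= ftplib.FTP(hostname, username, password)
site.cwd(the_directory_to_work_on) # if it's '.', you can skip this line
folder= FTPDirectory()
folder.getdata(site) # get the filenames
for path, ftpfile in folder.walk():
    if ftpfile.mtime < quite_old:
        site.delete(ftpfile.name)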

This should do what you want.
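For readers on Python 3.3 or later, ftplib itself exposes MLSD through FTP.mlsd(), which yields (name, facts) pairs with lower-cased fact names. The following is a minimal sketch of the same single-directory cleanup under that assumption. The names select_old_files and delete_old_files are illustrative and not part of the module above, and the sketch treats the 'modify' fact as UTC.

```python
import calendar
import time


def select_old_files(entries, cutoff):
    """Return names of plain files whose MLSD 'modify' fact is older than cutoff.

    entries: iterable of (name, facts) pairs, as produced by ftplib.FTP.mlsd().
    cutoff:  seconds since the epoch (UTC).
    """
    old = []
    for name, facts in entries:
        if facts.get('type') != 'file' or 'modify' not in facts:
            continue  # skip directories, '.'/'..' entries and incomplete listings
        # 'modify' is YYYYMMDDHHMMSS, optionally followed by a .fraction
        stamp = facts['modify'].split('.')[0]
        mtime = calendar.timegm(time.strptime(stamp, '%Y%m%d%H%M%S'))
        if mtime < cutoff:
            old.append(name)
    return old


def delete_old_files(site, max_age_days=7):
    """Delete old files in the current directory of a logged-in ftplib.FTP object."""
    cutoff = time.time() - max_age_days * 86400
    for name in select_old_files(site.mlsd(), cutoff):
        site.delete(name)
```

Keeping the selection separate from the deletion makes it possible to print the candidate names first and only call delete_old_files once the list looks right.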

a directory and its descendants

Now, if this should work recursively, you'll have to do the following two changes in the code for “single directory case”:

folder= FTPTree()

and

site.delete(path + '/' + ftpfile.name) # FTP paths use '/'; this also avoids needing `import os` here

Possible caveat

The servers I've worked with didn't have any issues with relative paths in the STOR and DELE commands, so site.delete with a relative path worked too. If your FTP server requires pathless filenames, you should first .cwd to the path provided, .delete the plain ftpfile.name and then .cwd back to the base folder.
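The cwd-based fallback described above could be sketched roughly as follows. The helper name delete_in_directory is hypothetical; site is assumed to be a logged-in ftplib.FTP object, and path is the relative path yielded by walk().

```python
def delete_in_directory(site, path, filename):
    """Delete `filename` inside `path` on servers that reject paths in DELE."""
    base = site.pwd()          # remember the folder we started in
    site.cwd(path)             # move into the file's directory
    try:
        site.delete(filename)  # DELE with a plain, pathless name
    finally:
        site.cwd(base)         # return to the base folder even if DELE failed
```

In the recursive loop this would replace the site.delete call, for example delete_in_directory(site, path, ftpfile.name).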

tzot
  • Hi ΤΖΩΤΖΙΟΥ, thank you for your idea, it looks very good to me. I have tried it out, and I had to modify the code slightly, but I get an error message: site= ftplib.FTP('127.0.0.1, admin, admin') File "C:\Python26\lib\ftplib.py", line 116, in __init__ self.connect(host) File "C:\Python26\lib\ftplib.py", line 131, in connect self.sock = socket.create_connection((self.host, self.port), self.timeout) for res in getaddrinfo(host, port, 0, SOCK_STREAM): socket.gaierror: [Errno 11001] getaddrinfo failed – Tom Aug 04 '10 at 12:48
  • import os, time, FTP_AUTO from ftplib import FTP quite_old= time.time() - 7*86400 # seven days # C:\Temp\ftp\test123 site= ftplib.FTP('127.0.0.1, admin, admin') site.cwd(test123) # if it's '.', you can skip this line folder= FTPDirectory() print folder folder.getdata(site) # get the filenames for path, ftpfile in folder.walk(): if ftpfile.mtime < quite_old: site.delete(ftpfile.name) – Tom Aug 04 '10 at 12:52
  • @Tom: `'127.0.0.1, admin, admin'` is not a valid hostname; that's what the error is about. You probably meant `'127.0.0.1', 'admin', 'admin'` in your code. – tzot Aug 04 '10 at 12:52
  • Thank you, the connection is now working. But the system stated that: File "G:/MY_TCS/!!PROJECTS/Q3/FTP_auto_del/python/ftp_del.py", line 6, in folder= FTPDirectory() NameError: name 'FTPDirectory' is not defined – Tom Aug 04 '10 at 13:01
  • @Tom: how did you name my module? Did you import it at the start of ftp_del.py? If you saved my code as, say, ftptool.py, then at the start of ftp_del.py you should `import ftptool` and later have the classes prefixed with the module name, e.g. `folder = ftptool.FTPDirectory()`. ISTM you need to read the Python tutorial first; it's like you lack basic knowledge about Python. – tzot Aug 04 '10 at 19:50
  • Hi ΤΖΩΤΖΙΟΥ, I named your module "FTP_dir" in that case, and I import it as you mentioned. Now it seems to work! The old files are deleted from my test FTP server; next I will try it in the production environment. Thank you very much for your assistance and help! It responds in the console with: All look GOOD! – Tom Aug 05 '10 at 09:23
  • It worked in the test environment (a Windows-based FileZilla Server), but in the production environment I get the error: ftplib.error_perm: 500 Cannot understand 'MLSD'. Would there be a workaround for this issue? Can the provider just switch "MLSD" commands on? – Tom Aug 05 '10 at 15:39
  • This is terrific code! Some things: @Tom MLSD was only standardized in 2007 (RFC 3659), so you might need to update your FTP server. It was introduced because every FTP server used a different format with LIST. NOTE: In function addline, you should convert field_name to lowercase. There are servers such as ServU that return uppercase field names. field_name = field_name.lower() – SilentSteel Jul 16 '13 at 07:42
  • I had to turn the `field_name` into lower case as the FTP server was returning `Type`, `Modify` etc and this checks for `type`, `modify` etc. – David Dec 08 '16 at 09:22
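Whether a server understands MLSD can usually be checked before relying on it: servers that implement the FEAT command (RFC 2389) list their extensions in the reply, and RFC 3659 has them advertise an MLST feature line, which covers MLSD as well. A rough sketch, assuming ftp is a logged-in ftplib.FTP object. The function names are illustrative, and a server that rejects FEAT is simply treated as not supporting MLSD.

```python
import ftplib


def feat_lists_mlst(feat_reply):
    """Return True if a FEAT reply contains an MLST feature line."""
    for line in feat_reply.splitlines():
        words = line.strip().upper().split()
        if words and words[0] == 'MLST':
            return True
    return False


def supports_mlsd(ftp):
    """Ask a logged-in ftplib.FTP object whether the server advertises MLST/MLSD."""
    try:
        reply = ftp.sendcmd('FEAT')   # multi-line reply, returned as one string
    except ftplib.error_perm:         # 5xx reply: FEAT itself is not implemented
        return False
    return feat_lists_mlst(reply)
```

If supports_mlsd returns False, the LIST-based approach from the other answers is the fallback.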
4

I had to do this and it took a while, so I thought I could save someone's time here. We are using Python with the ftputil module installed:

#! /usr/bin/python
import time
import ftputil

host = ftputil.FTPHost('ftphost.com', 'username', 'password')
mypath = 'ftp_dir'
now = time.time()
host.chdir(mypath)
names = host.listdir(host.curdir)
for name in names:
    # only plain files that are more than seven days old are removed
    if host.path.isfile(name) and host.path.getmtime(name) < now - 7 * 86400:
        host.remove(name)

print('Closing FTP connection')
host.close()
2

OK, well rather than analyze the code you have posted any further, here's an example instead that might put you on the right track.

from ftplib import FTP
import re

# captures the date column (e.g. 'May 16 13:47') and the name from a Unix-style LIST line
pattern = r'.* ([A-Za-z].. .. .....) (.*)'

def callback(line):
    found = re.match(pattern, line)
    if found is not None:
        print(found.groups())

ftp = FTP('myserver.wherever.com')
ftp.login('elvis','presley')
ftp.cwd('testing123')
ftp.retrlines('LIST',callback)

ftp.close()

Run it and you'll get output something like this, which should be a start towards what you're trying to achieve. To finish it out you'd need to parse the first result into a datetime, compare it with "now" and use ftp.delete() to get rid of the remote file if it's too old.

>>> 
('May 16 13:47', 'Thumbs.db')
('Feb 16 17:47', 'docs')
('Feb 23  2007', 'marvin')
('May 08  2009', 'notes')
('Aug 04  2009', 'other')
('Feb 11 18:24', 'ppp.xml')
('Jan 20  2010', 'reports')
('Oct 10  2005', 'transition')
>>> 
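The remaining step, turning those date strings into something comparable with "now", might look like the sketch below. It assumes the Unix-style LIST format shown above, English month abbreviations, and the usual convention that entries showing a time instead of a year are recent. The year-guessing rule and the function names are assumptions for illustration, not something the server guarantees.

```python
from datetime import datetime, timedelta


def parse_list_date(text, now=None):
    """Turn 'May 16 13:47' or 'Feb 23  2007' into a datetime."""
    now = now or datetime.now()
    month, day, last = text.split()
    if ':' in last:
        # Recent entries carry a time but no year: try the current year and
        # step back one year if that would place the file in the future.
        stamp = datetime.strptime('%d %s %s %s' % (now.year, month, day, last),
                                  '%Y %b %d %H:%M')
        if stamp > now + timedelta(days=1):
            stamp = stamp.replace(year=stamp.year - 1)
    else:
        # Older entries carry a year but no time of day.
        stamp = datetime.strptime('%s %s %s' % (last, month, day), '%Y %b %d')
    return stamp


def is_older_than(text, days, now=None):
    """True if the LIST date `text` lies more than `days` days before `now`."""
    now = now or datetime.now()
    return parse_list_date(text, now) < now - timedelta(days=days)
```

Inside callback, is_older_than(found.group(1), 7) could then decide whether to call ftp.delete(found.group(2)).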
eemz
  • Note however that different ftp servers format the output of the LIST command differently, so you may have to modify the regular expression to match the one you're using. – eemz May 19 '10 at 17:55
  • Hi thank you for your answer, I will try to modify my code accordingly. – Tom May 20 '10 at 09:01
  • I like the solution, it is very easy, but it isn't complete; we still have to deal with dates etc. – Sérgio Jan 27 '14 at 03:10
0

Well, it looks like the error you are seeing comes from trying to remove files from a 'test123' directory on your local machine, not on the FTP site: os.listdir, os.stat and os.remove only work on the local filesystem. ftplib's FTP class has a method called delete, and that's what you'd want to use to remove the remote file. As for testing whether something is 7 days old, you might actually have to pull those files down from the FTP server temporarily and check the modify times before using FTP.delete.
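Downloading is often not needed just to read a modification time: many servers implement the MDTM command (RFC 3659), which reports it directly. A minimal sketch, assuming the server supports MDTM and that ftp is a logged-in ftplib.FTP object already in the target directory. The helper names are illustrative.

```python
import calendar
import time


def parse_mdtm_reply(reply):
    """Convert an MDTM reply such as '213 20100519175500' to epoch seconds (UTC)."""
    # drop the 213 reply code and any fractional seconds
    stamp = reply.split()[1].split('.')[0]
    return calendar.timegm(time.strptime(stamp, '%Y%m%d%H%M%S'))


def delete_if_older(ftp, name, max_age_days=7):
    """Delete `name` in the current remote directory if MDTM says it is too old."""
    mtime = parse_mdtm_reply(ftp.sendcmd('MDTM ' + name))
    if mtime < time.time() - max_age_days * 86400:
        ftp.delete(name)
        return True
    return False
```

The names to check could come from ftp.nlst(). Some servers answer MDTM on a directory with an error, so a real script would probably wrap the call in a try/except for ftplib.error_perm.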

Ben Hayden
  • No, it shall jump into the directory "test123" and then delete every file in it which is older than 7 days. The machine is indicating that it is not able to find the directory. – Tom May 20 '10 at 08:54
0

What OS are you running on? The file path /test123/*.* is Unix-style yet the message says WindowsError. Are you taking the output of an ftp LIST command, which is in Unix-style, and trying to use it verbatim in a Windows script?

eemz
  • Hi, it is running on Windows 2003 Server, and it currently connects to a test FTP server which is running on Windows XP. – Tom May 20 '10 at 08:52