31

I have a cache folder that stores HTML files. They are overwritten when needed, but rarely used pages are also cached there and just end up taking space (after 5 weeks, the drive was full, with over 2.7 million cache files).

What's the best way to loop through a directory that contains several hundred thousand files and remove the files that are older than 1 day?

Gordon
  • 4
    Is there a reason why you need to do this in PHP? You might find a shell-scripting language more appropriate for this. – Dominic Rodger Feb 05 '10 at 08:05
  • You can do all of this and more using [the Linux `find` command](https://askubuntu.com/a/589224/372950) – rinogo Jul 28 '17 at 15:26

8 Answers

57

I think you could do this by looping through the directory with readdir and deleting files based on their timestamp:

<?php
$path = '/path/to/files/';
if ($handle = opendir($path)) {

    while (false !== ($file = readdir($handle))) {
        // Skip '.', '..' and subdirectories; only regular files are deleted
        if (!is_file($path . $file)) {
            continue;
        }
        $filelastmodified = filemtime($path . $file);
        //24 hours in a day * 3600 seconds per hour
        if((time() - $filelastmodified) > 24*3600)
        {
           unlink($path . $file);
        }

    }

    closedir($handle);
}
?>

The if((time() - $filelastmodified) > 24*3600) condition selects files older than 24 hours (24 hours times 3600 seconds per hour). For a longer period, multiply by the number of days, for example 7*24*3600 for files older than a week.

Also, note that filemtime returns the time the file was last modified, not its creation date.
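The loop above can also be wrapped in a reusable function. The following is only a minimal sketch: the name deleteFilesOlderThan and its parameters are hypothetical, and it assumes the cutoff may be computed once up front instead of calling time() for every file.

```php
<?php
/**
 * Hypothetical helper: delete regular files directly inside $dir whose
 * last modification time is more than $maxAgeSeconds in the past.
 * Returns the number of files that were deleted.
 */
function deleteFilesOlderThan(string $dir, int $maxAgeSeconds): int
{
    // Normalise the directory so it always ends with exactly one separator
    $dir = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR;
    // Compute the cutoff once instead of calling time() per file
    $cutoff = time() - $maxAgeSeconds;
    $deleted = 0;

    $handle = opendir($dir);
    if ($handle === false) {
        return 0;
    }

    while (false !== ($file = readdir($handle))) {
        $fullPath = $dir . $file;
        // is_file() is false for '.', '..' and subdirectories
        if (!is_file($fullPath)) {
            continue;
        }
        $mtime = filemtime($fullPath);
        if ($mtime !== false && $mtime < $cutoff && unlink($fullPath)) {
            $deleted++;
        }
    }

    closedir($handle);
    return $deleted;
}
```

For example, deleteFilesOlderThan('/path/to/files', 24 * 3600) would remove files last modified more than 24 hours ago and leave subdirectories untouched.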

Fl1p
Pawel J. Wal
11

It should be

if(is_file($path . $file) && (time()-$filelastmodified) > 24*3600)

to avoid errors for the . and .. directories. Note that is_file() needs the full path, not just the file name.

sth
ketan
  • 2
    Better to check if `$file == '.' || $file == '..'` to save time from checking `is_file()` every time... – barell Oct 20 '14 at 09:53
6

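The function below lists the files sorted by their filectime() timestamp, newest first. Note that on Unix this is the inode change time; only on Windows is it the creation time:

// $dir must end with a trailing slash
function listdir_by_date( $dir ){
  $h = opendir( $dir );
  $_list = array();
  // compare against false so that a file named "0" does not end the loop
  while( false !== ( $file = readdir( $h ) ) ){
    if( $file != '.' and $file != '..' ){
      $ctime = filectime( $dir . $file );
      $_list[ $file ] = $ctime;
    }
  }
  closedir( $h );
  // sort by timestamp (the values), newest first, keeping file names as keys
  arsort( $_list );
  return $_list;
}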

Example:

$_list = listdir_by_date($dir);

Now you can loop through the list to see their dates and delete accordingly:

$now = time();
$days = 1;
foreach( $_list as $file => $exp ){
  if( $exp < $now-60*60*24*$days ){
    unlink( $dir . $file );
  }
}
Linkmichiel
Sarfraz
  • Why would someone want to LOOP for getting the sorted list first and then LOOP again to delete? Doesn't make sense for overhead. – Osama Ibrahim Nov 16 '18 at 14:01
3

Try the SPL iterators:

// setup timezone and get timestamp for yesterday
date_default_timezone_set('Europe/Berlin'); // change to yours
$yesterday = strtotime('-1 day', time());

// setup path to cache dir and initialize iterator
$path      = realpath('/path/to/files'); // change to yours
$objects   = new RecursiveIteratorIterator(
                 new RecursiveDirectoryIterator($path));

// iterate over files in directory and delete them
foreach($objects as $name => $object){
    if ($object->isFile() && ($object->getCTime() < $yesterday)) {
        // dry run: uncomment the next line to actually delete the file
        // unlink($object->getPathname());
        echo PHP_EOL, 'deleted ' . $object;
    }
}

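Note that getCTime() returns the creation time only on Windows; on Unix it returns the inode change time. Use getMTime() to compare against the last modification time instead.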
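As a rough sketch of the same iterator approach based on modification time: the function name purgeOlderThan and its $dryRun flag are hypothetical and only serve as an illustration. FilesystemIterator::SKIP_DOTS keeps the . and .. entries out of the iteration, and deleting inside the loop avoids building a list of millions of paths in memory.

```php
<?php
/**
 * Hypothetical helper: walk $path recursively and count the files whose
 * modification time is before $cutoff (a Unix timestamp).
 * With $dryRun = false the matching files are deleted as well.
 */
function purgeOlderThan(string $path, int $cutoff, bool $dryRun = true): int
{
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS)
    );

    $count = 0;
    foreach ($iterator as $fileInfo) {
        if ($fileInfo->isFile() && $fileInfo->getMTime() < $cutoff) {
            // In a dry run the file is only counted, never removed
            if ($dryRun || unlink($fileInfo->getPathname())) {
                $count++;
            }
        }
    }
    return $count;
}
```

Calling purgeOlderThan($path, strtotime('-1 day')) first reports how many files would be affected; passing false as the third argument performs the deletion.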

Community
Gordon
2
/* Delete cache files here */
$dir = "cache/"; /** define the directory **/

/*** cycle through all files in the directory ***/
foreach (glob($dir."*") as $file) {
//foreach (glob($dir.'*.*') as $file){

    /*** if the file is older than 1 hour (3600 seconds), delete it; use 86400 for 24 hours ***/
    if (is_file($file) && filemtime($file) < time() - 3600) {
        unlink($file);
    }
}

I am using this, hope it helps. Also, I updated this on 13-02-2023, visit: https://www.arnlweb.com/forums/server-management/efficient-php-code-for-deleting-files-delete-all-files-older-than-2-days-the-right-way/

Lachit
0

Just to note: Gordon's time comparison (see above: https://stackoverflow.com/a/2205833/1875965) is the only correct one when comparing against 'days' rather than '24 hours', as not all days have 24 hours (daylight saving time changes, etc.).

E.g. use

// setup timezone and get timestamp for yesterday
date_default_timezone_set('Europe/Berlin'); // change as appropriate
$yesterday = strtotime('-1 day', time());

when comparing the file date.

This may not be a big issue, but can lead to unexpected behaviour when you're working with weeks/months etc. I found it best to stick to using the above method as it'll make any process involving dates/times consistent and avoid confusion.

Also check what the timezone is for the file dates, as sometimes the default for PHP differs from the system timezone.
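The difference can be seen in a small sketch. It uses DateTimeImmutable with a fixed date (28 March 2021, when clocks in Europe/Berlin went forward) instead of time() so that the result is reproducible; the date and variable names are chosen purely for illustration.

```php
<?php
$tz = new DateTimeZone('Europe/Berlin');

// Noon on the day the clocks went forward by one hour
$now = new DateTimeImmutable('2021-03-28 12:00:00', $tz);

// Same wall-clock time on the previous calendar day
$calendarDayAgo = $now->modify('-1 day');

// Exactly 86400 seconds earlier, regardless of the calendar
$fixed24hAgo = $now->getTimestamp() - 24 * 3600;

$diffSeconds = $now->getTimestamp() - $calendarDayAgo->getTimestamp();
echo $diffSeconds / 3600, " hours\n"; // prints "23 hours"

// The two cutoffs are one hour apart on this particular day
echo $calendarDayAgo->getTimestamp() - $fixed24hAgo, " seconds apart\n";
```

Whether the calendar-day or the fixed 24-hour cutoff is the right one depends on what "older than 1 day" should mean for the cache.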

Kind regards, Sandra.

Community
Sandra
0
$directory = $_SERVER['DOCUMENT_ROOT'].'/pathfromRoot/';

// drop the '.' and '..' entries by name; they are not guaranteed to come first
$files = array_diff(scandir($directory), array('.', '..'));
foreach($files as $file)
{
    // $extension      = substr($file, -3, 3);
    // if ($extension == 'jpg') // in case you only want specific files deleted
    // {
    if (!is_file($directory.$file)) continue; // skip subdirectories
    $stat = stat($directory.$file);
    $filedate = date_create(date("Y-m-d", $stat['ctime']));
    $today = date_create(date("Y-m-d"));
    $days = date_diff($filedate, $today, true);
    // delete files whose ctime is more than 180 days in the past
    if ($days->days > 180)
    {
        unlink($directory.$file);
    }
    // }
}
Roberto Caboni
-1

By changing @pawel's solution I created the function below. At first I forgot to prepend the path to the file name, which took me half an hour to figure out.

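// class method; uses PHP's built-in DIRECTORY_SEPARATOR constant
public function deleteOldFiles ($hours=24) {
    $path='cache'.DIRECTORY_SEPARATOR;
    if ( $handle = opendir( $path ) ) {
        while (false !== ($file = readdir($handle))) {
            // check is_file() first so '.' and '..' are skipped
            if (!is_file($path.$file)) {
                continue;
            }
            $filelastmodified = filemtime($path.$file);
            // use the $hours argument instead of a hard-coded 24
            if((time()-$filelastmodified) > $hours*3600)
            {
                unlink($path.$file);
            }
        }
        closedir($handle);
    }
}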
trante