22

Is there an elegant way in Perl to find the newest file in a directory (newest by modification date)?

What I have so far is searching for the files I need, getting each one's modification time, pushing the filename and modification time into an array, and then sorting it.

There must be a better way.

brian d foy
Tom Feiner

6 Answers

26

Your way is the "right" way if you need a sorted list (and not just the first entry; see Brian's answer for that). If you don't fancy writing that code yourself, use this:

use File::DirList;
my @list = File::DirList::list('.', 'M');

Personally I wouldn't go with the ls -t method - that involves forking another program and it's not portable. Hardly what I'd call "elegant"!


Regarding rjray's hand-coded solution, I'd change it slightly:

opendir(my $DH, $DIR) or die "Error opening $DIR: $!";
my @files = map { [ stat "$DIR/$_", $_ ] } grep(! /^\.\.?$/, readdir($DH));
closedir($DH);

sub rev_by_date { $b->[9] <=> $a->[9] }
my @sorted_files = sort rev_by_date @files;

After this, @sorted_files contains the sorted list, where the 0th element is the newest file. Each element is a reference to an array holding the results of stat, with the filename itself in the last position:

my @newest = @{$sorted_files[0]};
my $name = pop(@newest);

The advantage of this is that it's easier to change the sorting method later, if desired.
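To illustrate that flexibility, here is a minimal sketch in which the scan stays the same and only the comparator changes. The names `scan_dir`, `newest_first` and `largest_first` are made up for this example; the only facts relied on are that index 9 of stat's return list is the mtime and index 7 is the size in bytes. Entries that cannot be stat'ed (for example dangling symlinks) are skipped, which is an assumption about what you would want.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one array ref per directory entry,
# holding the stat fields followed by the entry name.
sub scan_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Error opening $dir: $!";
    my @files = map  { my @st = stat("$dir/$_"); @st ? [ @st, $_ ] : () }
                grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);
    return @files;
}

# Two interchangeable comparators over the same data.
sub newest_first  { $b->[9] <=> $a->[9] }   # by modification time
sub largest_first { $b->[7] <=> $a->[7] }   # by size in bytes

my @files   = scan_dir('.');
my @by_date = sort newest_first  @files;
my @by_size = sort largest_first @files;

# The entry name is the last element of each array ref.
print "Newest:  $by_date[0][-1]\n" if @by_date;
print "Largest: $by_size[0][-1]\n" if @by_size;
```

Because every element carries the full stat result, switching to another criterion (access time, inode change time, and so on) only means writing another one-line comparator.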


EDIT: here's an easier-to-read (but longer) version of the directory scan, which also ensures that only plain files are added to the listing:

my @files;
opendir(my $DH, $DIR) or die "Error opening $DIR: $!";
while (defined (my $file = readdir($DH))) {
  my $path = $DIR . '/' . $file;
  next unless (-f $path);           # ignore non-files - automatically does . and ..
  push(@files, [ stat(_), $path ]); # re-uses the stat results from '-f'
}
closedir($DH);

NB: the test for defined() on the result of readdir() is because a file called '0' would cause the loop to fail if you only test for if (my $file = readdir($DH))

Alnitak
  • Both `File::DirList` and `ls` require installation (on Windows at least). – jfs Nov 30 '08 at 15:48
  • If you want *newest* then use `@l = File::DirList::list('.', 'M'); say $l[0][0][13]` Note the capital M. – jfs Nov 30 '08 at 15:51
  • I wouldn't call his the "right" way. It's going to be a slug for a directory with many files. – brian d foy Nov 30 '08 at 17:28
  • Both `File::DirList::list` and `ls -t` return both filenames and *dirnames*. – jfs Nov 30 '08 at 17:33
  • brian - no more so than calling 'ls'. JFS - my new Perl version excludes dirnames. – Alnitak Nov 30 '08 at 17:35
  • You don't want to make paths like that. Use File::Spec to help you. – brian d foy Nov 30 '08 at 20:47
  • Brian - these are examples of _some_ improvements on another poster's example, not a howto on "perfect" portable coding. It would have been better to say _why_ you've changed your own example rather than just change it and delete the comments. – Alnitak Nov 30 '08 at 22:31
15

You don't need to keep all of the modification times and filenames in a list, and you probably shouldn't. All you need to do is look at one file at a time and see if it's newer than the newest you've previously seen:

use File::Spec;

{
    opendir my $dh, $dir or die "Could not open $dir: $!";

    # start with a sentinel age far larger than any real file's
    my( $newest_name, $newest_time ) = ( undef, 2**31 -1 );

    while( defined( my $file = readdir( $dh ) ) ) {
        my $path = File::Spec->catfile( $dir, $file );
        next if -d $path; # skip directories, or anything else you like
        # -M is the file's age in days, so a smaller value means a newer file
        ( $newest_name, $newest_time ) = ( $file, -M _ ) if( -M $path < $newest_time );
    }

    print "Newest file is $newest_name\n";
}
Matthew Lock
brian d foy
  • It doesn't filter directories names. – jfs Nov 30 '08 at 21:47
  • 1
    Since `$newest_time` is automatically initialized to 0, `-M $path` can never be less. You can initialize it like this: `$newest_time = 2**31 - 1` – Dennis Williamson May 10 '12 at 20:24
  • 2
    You've caught a real problem, but for the wrong reason. `-M` is negative when the file is modified after the program starts. I should set the newest time if it is not yet defined. $newest_time is not automatically initialized to zero: it's converted to 0 if it's undefined and I use it numerically (as in the `<` operator). You don't want to set it to some magic value either. You want to have the absence of value until you have a meaningful one. :) – brian d foy May 12 '12 at 00:30
  • There's still a bug in this code: for the first file -M _ returns some random value from the stat cache since the -M $path is not executed. – soger Jun 29 '16 at 16:02
  • Could someone expand on @soger 's comment. I'd like to use (a modification of) this snippet, but I need it to be bug free of course. Brian, do you understand what soger is trying to say? – Bram Vanroy Sep 14 '17 at 19:52
  • 1
    The code has been corrected since my comment, you can use it @BramVanroy. – soger Sep 15 '17 at 10:15
11

You could try using the shell's ls command:

@list = `ls -t`;
chomp @list;          # each line from the backticks ends with a newline
$newest = $list[0];
Nathan Fellman
  • Only works on UNIX (or Windows with an ls command in, say, Cygwin), but it IS a more elegant solution. – paxdiablo Nov 30 '08 at 10:29
  • It works on Windows (using gnuwin32 utilities). But `ls -t` returns both file and *directory* names. – jfs Nov 30 '08 at 15:27
  • This isn't elegant at all. It's just less typing. Now you need to create a new process for every directory you want to examine. Hardly pretty. :) – brian d foy Nov 30 '08 at 17:29
6

Assuming you know the $DIR you want to look in:

opendir(my $DH, $DIR) or die "Error opening $DIR: $!";
my %files = map { $_ => (stat("$DIR/$_"))[9] } grep(! /^\.\.?$/, readdir($DH));
closedir($DH);
my @sorted_files = sort { $files{$b} <=> $files{$a} } (keys %files);
# $sorted_files[0] is the most-recently modified. If it isn't the actual
# file-of-interest, you can iterate through @sorted_files until you find
# the interesting file(s).

The grep that wraps the readdir filters out the "." and ".." special files in a UNIX(-ish) filesystem.

rjray
  • Isn't this almost what Bonzo is saying he's doing? "searching for the files I need, and for each one get it's modification time, push into an array containing the filename, modification time, then sort it." You just change the array of tuples for a hash. – Vinko Vrsalovic Nov 30 '08 at 10:19
2

If you can't let ls do the sorting for you as @Nathan suggests, then you can optimize your process by only keeping the newest modification time and associated filename seen thus far and replace it every time you find a newer file in the directory. No need to keep any files around that you know are older than the newest one you've seen so far and certainly no need to sort them since you can detect which is the newest one while reading from the directory.
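A minimal sketch of that single-pass idea, assuming you only care about plain files and want to compare raw modification times from stat. The sub name `newest_file` is made up for this example. The running best starts out undefined rather than at a magic number, so the first plain file seen always becomes the initial candidate.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: returns the name of the most recently modified
# plain file in $dir, or undef if the directory holds no plain files.
sub newest_file {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Could not open $dir: $!";

    my ($newest_name, $newest_mtime);
    while (defined(my $file = readdir($dh))) {
        my $path = File::Spec->catfile($dir, $file);
        next unless -f $path;        # plain files only; also skips . and ..
        my $mtime = (stat(_))[9];    # reuse the stat already done by -f
        # Keep this entry if it is the first one seen or newer than the best so far.
        if (!defined $newest_mtime || $mtime > $newest_mtime) {
            ($newest_name, $newest_mtime) = ($file, $mtime);
        }
    }
    closedir($dh);
    return $newest_name;
}

my $name = newest_file('.');
print defined $name ? "Newest file is $name\n" : "No plain files found\n";
```

This keeps one name and one timestamp in memory regardless of how many entries the directory has, and it never sorts anything.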

tvanfosson
-1

Subject is old, but maybe someone will try it - it isn't portable (Unix-like systems only), but it's quite simple and works:

chdir $directory or die "cannot change directory";

my $newest_file = `bash -c 'ls -t | head -1'`;

chomp $newest_file;

print "$newest_file \n";

Community
  • 3
    -1 I don't see any advantage of making this a shell script problem, especially when the proposed shell script attempts to parse the output of `ls`. Perl is well equipped to handle the problematic corner cases (file names with newlines, etc). – tripleee Aug 21 '12 at 07:58