180

In Bash, I would like to create a function that returns the filename of the newest file that matches a certain pattern. For example, I have a directory of files like:

Directory/
   a1.1_5_1
   a1.2_1_4
   b2.1_0
   b2.2_3_4
   b2.3_2_0

I want the newest file that starts with 'b2'. How do I do this in bash? I need to have this in my ~/.bash_profile script.

Lesmana
jlconlin
  • 4
    see http://superuser.com/questions/294161/unix-linux-find-and-sort-by-date-modified for more answer hints. The sorting is the key step to get your newest file – Wolfgang Fahl Dec 04 '16 at 13:06

9 Answers

297

The ls command has a -t option that sorts by modification time, newest first. You can then grab the first (newest) entry with head -1.

ls -t b2* | head -1

But beware: Why you shouldn't parse the output of ls

My personal opinion: parsing ls is dangerous when the filenames can contain funny characters like spaces or newlines.

If you can guarantee that the filenames will not contain funny characters (maybe because you are in control of how the files are generated) then parsing ls is quite safe.

If you are developing a script which is meant to be run by many people on many systems in many different situations then do not parse ls.
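To make the risk concrete, here is a small demonstration (a sketch, not part of the original answer) of what happens when a filename contains a newline. The directory and file names are made up for illustration, and it assumes ls writes names unquoted when its output goes to a pipe, which is the usual behaviour.

```shell
# Scratch directory with two files; the newer one has a newline in its name.
demo_dir=$(mktemp -d)
touch -t 202001010000 "$demo_dir/b2.1_0"
touch -t 202201010000 "$demo_dir/b2.2
oops"

# Parse ls the "easy" way: head only sees the first line of the newest name.
picked=$(cd "$demo_dir" && ls -t b2* | head -n 1)
printf 'ls picked: %s\n' "$picked"

# The truncated name does not refer to any real file.
[ -e "$demo_dir/$picked" ] || echo "but no such file exists"
```

The pipeline reports only the part of the newest name before the newline, so anything that later opens "$picked" fails or, worse, hits a different file.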

Here is how to do it safely: How can I find the latest (newest, earliest, oldest) file in a directory?

unset -v latest
for file in "$dir"/*; do
  [[ $file -nt $latest ]] && latest=$file
done
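For the ~/.bash_profile use case in the question, the loop above could be wrapped in a function along these lines. This is a minimal sketch: the name newest_with_prefix and its two arguments (directory, filename prefix) are assumptions made for illustration, and "newest" is taken to mean the most recent modification time.

```shell
# Hypothetical helper: newest_with_prefix DIR PREFIX
# Prints the most recently modified regular file in DIR whose name starts
# with PREFIX. Prints nothing and returns non-zero if there is no match.
newest_with_prefix() {
    local dir="$1" prefix="$2" file latest=""
    for file in "$dir"/"$prefix"*; do
        # Skip non-files, including the unexpanded pattern when nothing matches
        [ -f "$file" ] || continue
        if [ -z "$latest" ] || [ "$file" -nt "$latest" ]; then
            latest=$file
        fi
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

With the directory from the question, newest_with_prefix Directory b2 would print the path of whichever b2* file was modified last. Because the glob is expanded by the shell and never passed through ls, names with spaces or newlines are handled correctly.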
Lesmana
  • 11
    Note to others: if you are doing this for a directory, you would add the -d option to ls, like this 'ls -td | head -1' – ken.ganong Dec 27 '12 at 19:53
  • 6
    The [parsing LS](http://mywiki.wooledge.org/ParsingLs) link says not to do this and recommends the methods in [BashFAQ 99](http://mywiki.wooledge.org/BashFAQ/099). I'm looking for a 1-liner rather than something bullet-proof to include in a script, so I'll continue to parse ls unsafely like @lesmana. – Eponymous Jun 05 '14 at 00:07
  • 1
    @Eponymous: If you're looking for a one liner without using the fragile `ls`, `printf "%s\n" b2* | head -1` will do it for you. – David Ongaro Dec 01 '16 at 04:27
  • @Eponymous In the question the "newest" file is implied by the actual file name (increasing version numbers) so lexical ordering is correct here. Relying on the modification timestamp is less reliable here since files can be touched. – David Ongaro Dec 01 '16 at 18:25
  • 3
    @DavidOngaro The question does not say that the filenames are version numbers. This is about modification times. Even with the filename assumption `b2.10_5_2` kills this solution. – Eponymous Dec 01 '16 at 22:38
  • @Eponymous: It also doesn't say anything about "modification times", so what "newest" means is up to interpretation. Of course if one orders version numbers lexicographically one has to consider the appropriate number of 0 prefixes beforehand depending how big they'll get. – David Ongaro Dec 02 '16 at 01:15
  • @DavidOngaro shouldn't that be `tail -1`, not `head`? Head will give you the lexically oldest, won't it? – Fadecomic Feb 22 '17 at 22:47
  • @Fadecomic: Thanks for pointing that out. Indeed the lexically newest should be at the end while `ls -t` has the file with the newest timestamp at the top. – David Ongaro Feb 22 '17 at 23:53
  • 1
    Your one liner is giving me right answer, but the "right" way is actually giving me the *oldest* file. Any idea why? – NewNameStat Apr 04 '19 at 21:59
  • As for the security of parsing *`ls`*: if you can verify that the files do not have any abnormal characters i.e. you yourself made them and you know it cannot be changed then it can be considered 'safe'. But as soon as you introduce other uses it is another matter entirely. And even then: what if you didn't foresee something? So it's a choice one has to make but a choice that requires awareness. Which ultimately is a huge problem in our world. – Pryftan Oct 29 '19 at 13:31
  • @Fadecomic That or using the *`-r`* option to *`ls`*. – Pryftan Oct 29 '19 at 13:31
  • A nice "add-on" for log files: something like `ls -t b2* | head -1 | xargs tail -f` gives you an auto-refreshing tail of the newest file – philshem May 14 '20 at 06:33
  • The second variant ("how to do it right") did not work for me, it gave me "vmlinuz.old" as a result. I used `file=$(ls -t *.prg | head -1)` in the end, when I quote the filename in further usage (`"$file"`), it works fine even with spaces in filenames. – Peter B. Jan 03 '21 at 15:23
  • This will run into "argument list too long" if the wildcard matches too many files. The second solution avoids that, but suffers from [broken quoting.](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Jan 06 '22 at 08:38
34

The combination of find and ls works well for

  • filenames without newlines
  • not a very large number of files
  • not very long filenames

The solution:

find . -name "my-pattern" -print0 |
    xargs -r -0 ls -1 -t |
    head -1

Let's break it down:

With find we can match all interesting files like this:

find . -name "my-pattern" ...

then using -print0 we can pass all filenames safely to the ls like this:

find . -name "my-pattern" -print0 | xargs -r -0 ls -1 -t

Additional find tests and patterns can be added where the ... placeholder appears (the dots themselves are not part of the command):

find . -name "my-pattern" ... -print0 | xargs -r -0 ls -1 -t

ls -t sorts the files by modification time (newest first) and prints them one per line. Add -c to sort by the inode status change time (ctime) instead; note that this is not the creation time. Note: this will break with filenames containing newlines.

Finally head -1 gets us the first file in the sorted list.

Note: xargs is subject to system limits on the size of the argument list. If the file list exceeds that size, xargs will call ls multiple times. This breaks the sorting and probably also the final output. Run

xargs  --show-limits

to check the limits on your system.

Note 2: use find . -maxdepth 1 -name "my-pattern" -print0 if you don't want to search files through subfolders.

Note 3: As pointed out by @starfry, the -r argument for xargs prevents ls -1 -t from being called when find matches no files. Thank you for the suggestion.
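If the argument-list limit described in the first note is a concern, one possible workaround is to let find print the timestamps itself, so that neither xargs nor ls is involved. The following is a sketch under stated assumptions, not the method of this answer: it needs GNU find (-printf) and GNU coreutils 8.25 or later (the -z option of sort, head and cut), and the function name newest_by_find is made up for illustration.

```shell
# Hypothetical helper: newest_by_find DIR NAME_PATTERN
# Assumes GNU find and GNU coreutils; records are NUL-delimited so that
# file names containing spaces, tabs or newlines survive the pipeline.
newest_by_find() {
    find "$1" -type f -name "$2" -printf '%T@\t%p\0' |  # "<epoch mtime><TAB><path><NUL>"
        LC_ALL=C sort -z -rn |   # newest timestamp first
        head -z -n 1 |           # keep only the first record
        cut -z -f2- |            # drop the timestamp field
        tr '\0' '\n'             # end the result with a newline instead of NUL
}
```

For example, newest_by_find . 'b2*' searches the current directory recursively. Since sort sees every record in a single stream, the result does not depend on how many files match.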

Boris Brodski
  • 2
    This is better than the ls based solutions, as it works for directories with extremely many files, where ls chokes. – Marcin Zukowski Sep 05 '18 at 18:35
  • `find . -name "my-pattern" ... -print0` gives me `find: paths must precede expression: \`...'` – Jaakko Feb 06 '19 at 14:24
  • 1
    Oh! `...` stands for "more parameters". Just omit it, if you don't need it. – Boris Brodski Feb 06 '19 at 14:31
  • 3
    I found that this can return a file that does not match the pattern if there are no files that do match the pattern. It happens because find passes nothing to xargs which then invokes ls with no file lists, causing it to work on all files. The solution is to add `-r` to the xargs command-line which tells xargs not to run its command-line if it receives nothing on its standard input. – starfry Mar 01 '19 at 10:15
  • @starfry thank you! Nice catch. I added `-r` to the answer. – Boris Brodski Mar 06 '19 at 10:08
  • As others have mentioned, this answer is incorrect. `find: paths must precede expression: `...'`. Please edit your answer to remove the incorrect code. Don't just append more text. That just adds noise. – Cerin Apr 24 '20 at 03:07
  • @Cerin fixed it. The triple dots "..." stood here as a placeholder for "more parameters" and shouldn't be executed as a part of the command. See the third comment above. Removed it to prevent confusion. – Boris Brodski Apr 24 '20 at 08:33
  • This is still incorrect if you have many files, because `xargs` will then invoke `ls -t` multiple times, and `head -1` will get the first file from the first invocation of `ls -t`, not the newest overall. See https://stackoverflow.com/questions/64320196/why-does-find-return-different-sorted-results-when-passed-the-current-directory – tripleee Jan 06 '22 at 08:28
  • @tripleee This is exactly, what I described in the "Note" section. It is still useful and relatively simple solution for most cases. – Boris Brodski Jan 13 '22 at 08:51
  • The note doesn't really tell beginners that this is buggy in its current form, and obviously offers no remedy. – tripleee Jan 13 '22 at 08:53
  • @tripleee The first sentence is "The combination of find and ls works well for ... not very large amount of files". IMHO It defines the frame of the application of this solution pretty well. In bash you most of the time have to trade between practical 95% solutions and very involved 100% solutions. My solution is modular, showing beginners how to solve not only this particular problem, but also a lot of other problems by combining tools, like find and ls using args and print0. – Boris Brodski Jan 13 '22 at 09:01
12

This is a possible implementation of the required Bash function:

# Print the newest file, if any, matching the given pattern
# Example usage:
#   newest_matching_file 'b2*'
# WARNING: Files whose names begin with a dot will not be checked
function newest_matching_file
{
    # Use ${1-} instead of $1 in case 'nounset' is set
    local -r glob_pattern=${1-}

    if (( $# != 1 )) ; then
        echo 'usage: newest_matching_file GLOB_PATTERN' >&2
        return 1
    fi

    # To avoid printing garbage if no files match the pattern, set
    # 'nullglob' if necessary
    local -i need_to_unset_nullglob=0
    if [[ ":$BASHOPTS:" != *:nullglob:* ]] ; then
        shopt -s nullglob
        need_to_unset_nullglob=1
    fi

    # Keep the loop variables local so they do not leak into the caller
    local file newest_file=
    for file in $glob_pattern ; do
        [[ -z $newest_file || $file -nt $newest_file ]] \
            && newest_file=$file
    done

    # To avoid unexpected behaviour elsewhere, unset nullglob if it was
    # set by this function
    (( need_to_unset_nullglob )) && shopt -u nullglob

    # Use printf instead of echo in case the file name begins with '-'
    [[ -n $newest_file ]] && printf '%s\n' "$newest_file"

    return 0
}

It uses only Bash builtins, and should handle files whose names contain newlines or other unusual characters.
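A variant of the nullglob handling, along the lines suggested in the comments, is to save the current setting with shopt -p and restore it afterwards. The sketch below is a shortened illustration rather than a replacement for the function above; the name newest_matching is hypothetical, and it does use a command substitution, which the author of this answer prefers to avoid.

```shell
# Hypothetical variant: newest_matching GLOB_PATTERN
newest_matching() {
    local saved_nullglob file newest=
    # shopt -p prints a command ("shopt -s nullglob" or "shopt -u nullglob")
    # that recreates the current setting; it returns non-zero when unset.
    saved_nullglob=$(shopt -p nullglob) || true
    shopt -s nullglob
    for file in $1; do                   # unquoted on purpose: expand the glob
        [[ -z $newest || $file -nt $newest ]] && newest=$file
    done
    eval "$saved_nullglob"               # put nullglob back how it was
    [[ -n $newest ]] && printf '%s\n' "$newest"
    return 0
}
```

Called as newest_matching 'b2*', it prints the newest match, or nothing if no file matches, and leaves the caller's nullglob setting unchanged either way.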

pjh
  • 1
    You could use `nullglob_shopt=$(shopt -p nullglob)` and then later `$nullglob_shopt` to put back `nullglob` how it was before. – gniourf_gniourf Nov 05 '14 at 21:14
  • The suggestion by @gniourf_gniourf to use $(shopt -p nullglob) is a good one. I generally try to avoid using command substitution (`$()` or backticks) because it is slow, particularly under Cygwin, even when the command only uses builtins. Also, the subshell context in which the commands get run can sometimes cause them to behave in unexpected ways. I also try to avoid storing commands in variables (like `nullglob_shopt`) because very bad things can happen if you get the value of the variable wrong. – pjh Nov 06 '14 at 20:25
  • I appreciate the attention to details that can lead to obscure failure when overlooked. Thanks! – Ron Burk Jul 31 '16 at 18:32
  • I love that you went for a more unique way to solve the problem! It's a certainty that in Unix/Linux there is more than one way to 'skin the *`cat`*!'. Even if this takes more work it has the benefit of showing people concepts. Have a +1! – Pryftan Oct 29 '19 at 13:34
9

Use the find command.

Assuming you have GNU find (the -printf option is a GNU extension, not POSIX), use -printf '%T+ %p\n' to print each file's modification timestamp ahead of its path.

find "$DIR" -type f -printf '%T+ %p\n' | sort -r | head -n 1 | cut -d' ' -f2-

Example:

find ~/Downloads -type f -printf '%T+ %p\n' | sort -r | head -n 1 | cut -d' ' -f2-

For a more useful script, see the find-latest script here: https://github.com/l3x/helpers

l3x
  • to work with file names that contains spaces change cut -d' ' -f2,3,4,5,6,7,8,9 ... – valodzka Apr 02 '20 at 09:17
  • 1
    The version of Bash is unimportant. You need to have GNU `find` because the `-printf` option is non-standard (so typically, out of the box, this will only work on Linux). – tripleee Jan 06 '22 at 08:31
6

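You can use stat with a file glob and a decorate-sort-undecorate approach, with the file's modification time added on the front (this is the BSD/macOS stat syntax; GNU stat on Linux uses -c with different format codes):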

$ stat -f "%m%t%N" b2* | sort -rn | head -1 | cut -f2-
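Since the stat options differ between platforms, a wrapper can probe which flavour is installed. The following sketch assumes only two flavours exist on the systems you care about (GNU coreutils stat and BSD/macOS stat); the function name newest_by_stat is made up for illustration, and like the one-liner above it breaks on file names containing newlines.

```shell
# Hypothetical helper: newest_by_stat FILE...
newest_by_stat() {
    local tab
    tab=$(printf '\t')
    if stat -c '%Y' -- / >/dev/null 2>&1; then
        # GNU stat: %Y = mtime in epoch seconds, %n = file name
        stat -c "%Y${tab}%n" -- "$@"
    else
        # BSD/macOS stat: %m = mtime, %t = tab, %N = file name
        stat -f '%m%t%N' -- "$@"
    fi | sort -rn | head -n 1 | cut -f2-   # decorate, sort, undecorate
}
```

It is called with the expanded glob, for example newest_by_stat b2*, so very large numbers of matches can still run into the argument-list limit.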

As stated in the comments, the best cross-platform solution may be a Python, Perl, or Ruby script.

For such things, I tend to use Ruby since it is very awk-like in the ease of writing small, throwaway scripts, yet has the power of Python or Perl right from the command line.

Here is a Ruby version:

ruby -e '
# index [0] for oldest and [-1] for newest
newest=Dir.glob("*").
    reject { |f| File.directory?(f)}.
    sort_by { |f| File.birthtime(f) rescue File.mtime(f) 
    }[-1]
p newest'

That gets the newest file in the current working directory.

You can also make the glob recursive by using **/* in glob or limit to matched files with b2*, etc

dawg
  • nope. "stat: cannot read file system information for '%m%t%N': No such file or directory" – Ken Ingram Aug 15 '19 at 01:26
  • 3
    I think this might be for the Mac/FreeBSD version of `stat`, if I'm remembering its options correctly. To get similar output on other platforms, you could use `stat -c $'%Y\t%n' b2* | sort -rn | head -n1 | cut -f2-` – Jeffrey Cash May 11 '20 at 05:35
  • 1
    With "other platforms" you probably mean Linux. There are other platforms still which require different options, or, in the worst case, don't easily provide this level of granularity of control over the behavior of `stat`. If you need a portable solution, paradoxically, maybe write a Perl or Python script. – tripleee Jan 06 '22 at 08:36
6

A Bash function to find the newest file under a directory matching a pattern

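#1. Make a Bash function:
newest_file_matching_pattern() {
    # Quote "$1" so directories with spaces work; -r (GNU xargs) stops ls
    # from running, and listing the current directory, when nothing matches
    find "$1" -name "$2" -print0 | xargs -r -0 ls -1 -t | head -n 1
}

#2. Set up a scratch testing directory:
mkdir -p /tmp/files_to_move
cd /tmp/files_to_move
touch file1.txt
touch file2.txt
touch foobar.txt

#3. Invoke the function:
result=$(newest_file_matching_pattern /tmp/files_to_move "file*")
# Pass the value as an argument so % or \ in a file name is printed literally
printf 'result: %s\n' "$result"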

Prints:

result: /tmp/files_to_move/file2.txt

Or, if brittle Bash parlor tricks that subcontract to the Python interpreter are more your angle, this does the same thing:

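#!/bin/bash

function newest_file_matching_pattern {
# The closing END must stand alone on its line, with no trailing spaces
python - <<END
import glob, os
# max() keyed on mtime picks the most recently modified match
print(max(glob.glob("/tmp/files_to_move/file*"), key=os.path.getmtime))
END
}

result=$(newest_file_matching_pattern)
printf 'result: %s\n' "$result"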

Prints:

result: /tmp/files_to_move/file2.txt
Eric Leschinski
  • That's [broken quoting](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) and a [useless use of `echo`](https://www.iki.fi/era/unix/award.html#echo) but also more fatally [broken for large directories](https://stackoverflow.com/questions/64320196/why-does-find-return-different-sorted-results-when-passed-the-current-directory) – tripleee Jan 06 '22 at 08:33
  • @tripleee All excellent bash tips and links. Cringe code from 3 years ago made less bad. – Eric Leschinski Jan 06 '22 at 16:36
4

Unusual filenames (such as one containing the valid \n character) can wreak havoc with this kind of parsing. Here's a way to do it in Perl:

perl -le '@sorted = map {$_->[0]} 
                    sort {$a->[1] <=> $b->[1]} 
                    map {[$_, -M $_]} 
                    @ARGV;
          print $sorted[0]
' b2*

That's a Schwartzian transform used there.

glenn jackman
1

For googlers:

ls -t | head -1

  • -t sorts by last modification datetime
  • head -1 only returns the first result

(Don't use in production)

Tobias Feil
-2

There is a much more efficient way of achieving this. Consider the following command:

find . -cmin 1 -name "b2*"

This command finds files matching "b2*" whose status last changed one minute ago (-cmin tests the inode change time, not the creation time). For files modified within the last two days you would write -mtime -2; the command below, with -mtime 2, matches only files modified two days ago:

find . -mtime 2 -name "b2*"

The "." represents the current directory. Hope this helps.
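As a sketch of the age-based filtering described here, the helper below lists matching files modified within the last N minutes. The name recent_matching and its arguments are assumptions for illustration, and -maxdepth and -mmin are extensions found in GNU and BSD find rather than in POSIX. Note that this only narrows the candidates by age; it does not by itself pick the single newest file.

```shell
# Hypothetical helper: recent_matching DIR NAME_PATTERN MINUTES
# Lists regular files directly inside DIR that match NAME_PATTERN and were
# modified less than MINUTES minutes ago (note the leading minus sign).
recent_matching() {
    find "$1" -maxdepth 1 -type f -name "$2" -mmin "-$3"
}
```

For example, recent_matching . 'b2*' 5 lists the b2* files in the current directory that changed in the last five minutes.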

Naufal
  • 10
    This doesn't actually find the "newest file matching pattern"... it just finds all the files matching the pattern that were created a minute ago, or modified two days ago. – GnP Sep 12 '17 at 03:23
  • This answer was based on the question posed. Also, you can tweak the command to look at the latest file that came in a day or so ago. It depends on what you're trying to do. – Naufal Sep 12 '17 at 15:35
  • "tweaking" is not the answer. it's like posting this as an answer: "Just tweak the find command and find the answer depending on what you want to do" . – Kennet Celeste Jun 12 '19 at 19:53
  • Not sure about the unnecessary comment. If you feel like my answer does not substantiate, then please provide proper reason to why my answer doesn't make sense with EXAMPLES. If unable to do so, then please refrain from commenting further. – Naufal Jun 14 '19 at 10:15
  • 2
    Your solution requires you to know *when* the latest file was created. That was not in the question so no, your answer is not based on the question posed. – Bloke Down The Pub Jul 28 '19 at 14:05