47

Question: How do you delete all files in a directory except the newest 3?

Finding the newest 3 files is simple:

ls -t | head -3

But I need to find all files except the newest 3. How do I do that, and how do I delete those files in the same line, without an unnecessary for loop?

I'm using Debian Wheezy and bash scripts for this.

bytecode77
  • 4
    `ls` is actually the wrong tool for the job -- see http://mywiki.wooledge.org/ParsingLs. If you have GNU find, you can do much better with a `-printf` format string that has the timestamp (ideally in UNIX time for `sort -n -z`), a separator, and then a NUL following; that way even filenames with newlines won't throw it off. – Charles Duffy Nov 05 '14 at 19:14
  • I'd also disagree that using a loop here is unnecessary. Doing things correctly and robustly isn't the same as doing them tersely, but anything else is... well... incorrect. – Charles Duffy Nov 05 '14 at 19:17

11 Answers

90

This will list all files except the newest three:

ls -t | tail -n +4

This will delete those files:

ls -t | tail -n +4 | xargs rm --

This does not delete dotfiles. If you also want to delete dotfiles, change ls -t to ls -At.

The double dash (--) after rm is a safeguard against filenames starting with a dash. See here for more info: https://unix.stackexchange.com/questions/1519/how-do-i-delete-a-file-whose-name-begins-with-hyphen-a-k-a-dash-or-minus
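A small sketch of what the double dash protects against. The helper name `demo_double_dash` and the file name `-old.log` are made up for illustration, and everything happens in a throwaway directory created with `mktemp -d`:

```shell
# Hypothetical demo: show that rm needs "--" for a file whose name starts with a dash.
demo_double_dash() {
  local dir
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    touch -- -old.log                    # create a file whose name starts with a dash
    rm -old.log 2>/dev/null || true      # rm treats "-old.log" as options and fails
    [ -e ./-old.log ] && echo "still there after: rm -old.log"
    rm -- -old.log                       # "--" ends option parsing, so this works
    [ -e ./-old.log ] || echo "gone after: rm -- -old.log"
  )
  rm -rf -- "$dir"                       # clean up the scratch directory
}

demo_double_dash
```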

This command can fail horribly if the filenames contain spaces, newlines, or other funny characters. If your filenames can contain spaces, or if you plan to use this in a script, you should read these articles: http://mywiki.wooledge.org/ParsingLs and http://mywiki.wooledge.org/BashFAQ/003
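To make the failure mode concrete, here is a minimal sketch that counts how many arguments xargs hands to a command for a single input line. The helper name `count_xargs_args` and the sample file names are made up for illustration:

```shell
# Hypothetical helper: feed one line to xargs and report how many
# separate arguments the invoked command receives.
count_xargs_args() {
  printf '%s\n' "$1" | xargs sh -c 'echo "$#"' sh
}

count_xargs_args 'backup.tar'        # one name, one argument
count_xargs_args 'backup 2014.tar'   # one name, but xargs splits it at the space
```

With default settings xargs splits its input on blanks as well as newlines, so rm would be asked to delete `backup` and `2014.tar` rather than the file that actually exists.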

Lesmana
  • 1
    @DevilsChild: Depends on whether you care about being correct. If you don't care, just pipe to xargs... but don't **ever** do that if it's anything important (like backup scripts). – Charles Duffy Nov 05 '14 at 19:24
  • 7
    @DevilsChild I've literally seen TB of backups deleted because a buffer overflow created a file with garbage in its name, and someone had assumed that since filename creation was programmatic that unusual names could never happen. Taking shortcuts can bite you hard. – Charles Duffy Nov 05 '14 at 19:25
  • 2
    What is the purpose of the double dashes `--` in the `rm` command? – dokaspar Sep 09 '16 at 06:22
  • 3
    it is a safeguard against filenames starting with a dash or minus. http://unix.stackexchange.com/questions/1519/how-do-i-delete-a-file-whose-name-begins-with-hyphen-a-k-a-dash-or-minus – Lesmana Sep 09 '16 at 07:29
  • Very elegant solution :) – mafonya Aug 13 '18 at 13:56
  • In case you are not working in the current directory, you can output full path from ls like this: `ls -d /path/to/directory/* | sort -nr | tail -n +4 | xargs rm --` – Bojan Hrnkas Jun 19 '20 at 09:33
  • To cover the file that has spaces in file name, use this: ``ls -t | tail -n +4 | xargs -d "\n" -I {} rm {}`` – Youngmin Kim May 10 '22 at 00:55
33

Solution without the problems of "ls" (strangely named files)

This is a combination of ceving's and anubhava's answers. Neither solution worked for me as written. Because I was looking for a script to run every day to back up files into an archive, I wanted to avoid the problems with ls (someone could have saved a strangely named file in my backup folder). So I modified those solutions to fit my needs.

My solution deletes all files, except the three newest files.

find . -type f -printf '%T@\t%p\n' |
sort -t $'\t' -g |
head -n -3 |
cut -d $'\t' -f 2- |
xargs -r -d '\n' rm --

Some explanation:

find lists all regular files (not directories) in the current folder and, since no -maxdepth is given, in its subfolders. They are printed out with timestamps.
sort sorts the lines by timestamp (oldest on top).
head prints all lines except the last 3.
cut removes the timestamps.
xargs passes the selected files to rm. -r keeps it from running rm when no files are found, -d '\n' (a GNU option) makes it split its input on newlines only so that spaces in filenames survive, and -- ends rm's option parsing.

For you to verify my solution:

(
touch -d "6 days ago" test_6_days_old
touch -d "7 days ago" test_7_days_old
touch -d "8 days ago" test_8_days_old
touch -d "9 days ago" test_9_days_old
touch -d "10 days ago" test_10_days_old
)

This creates 5 files with different timestamps in the current folder. Run this script first and then the code for deleting old files.
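The check can also be run end to end in a scratch directory, so nothing in the real folder is touched. This is a sketch assuming GNU touch, find, head and xargs. The function name `demo_keep_newest_three` is made up, the tab is stored in a variable so the block does not depend on bash's `$'\t'` quoting, and the pipeline uses `xargs -d '\n'` and `rm --` as extra precautions:

```shell
# Hypothetical helper: build the five test files in a scratch directory,
# run the cleanup pipeline there, and print what is left.
demo_keep_newest_three() {
  local dir tab
  dir=$(mktemp -d) || return 1
  tab=$(printf '\t')                     # literal tab used as field delimiter
  (
    cd "$dir" || exit 1
    for days in 6 7 8 9 10; do
      touch -d "$days days ago" "test_${days}_days_old"
    done
    find . -type f -printf '%T@\t%p\n' | # decorate each path with its mtime
      sort -t "$tab" -g |                # oldest first
      head -n -3 |                       # drop the last (newest) three lines
      cut -d "$tab" -f 2- |              # strip the timestamp again
      xargs -r -d '\n' rm --             # delete what is left
    ls                                   # show the surviving files
  )
  rm -rf -- "$dir"                       # clean up the scratch directory
}

demo_keep_newest_three
```

The three files that remain should be the 6, 7 and 8 day old ones.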

Marc Wittke
flohall
  • Worked like a charm for my script ! – ieselisra Oct 18 '20 at 22:10
  • 2
    I like this script, but what happens when the filter doesn't catch any files? in my limited testing, `rm` exits with non-zero. As a test, run the file creation above and then the script, but use a value of `-8` in the `head` command. This returns no values, which makes `rm` exit with an error. However, it seems that the `-f` flag on `rm` will make it exit with 0, even if there were no results. So the last line could be modified as `xargs rm -f`, if you need the script to exit cleanly on no results. – EugeneRomero Jul 14 '21 at 07:51
  • 1
    You can alternatively add the "-r" option to xargs to handle that case as well. – Nikki Dec 09 '21 at 19:32
  • You are using the tab as delimiter. Won't a filename with tab in it break it? – mgutt May 30 '22 at 12:11
  • How does the `head -n -3` work if you have linefeeds in the filenames? I think you should use null separated strings and e.g. `head -z -n -3` instead. – Mikko Rantalainen Nov 16 '22 at 09:33
10

The following looks a bit complicated, but takes care to be correct even with unusual or intentionally malicious filenames. Unfortunately, it requires GNU tools:

count=0
while IFS= read -r -d ' ' && IFS= read -r -d '' filename; do
  (( ++count > 3 )) && printf '%s\0' "$filename"
done < <(find . -maxdepth 1 -type f -printf '%T@ %P\0' | sort -g -z -r) \
     | xargs -0 rm -f --

Explaining how this works:

  • Find emits <mtime> <filename><NUL> for each file in the current directory.
  • sort -g -z -r does a general (floating-point, as opposed to integer) numeric sort based on the first column (times), in reverse order so that the newest files come first and are the ones skipped, with the lines separated by NULs.
  • The first read in the while loop strips off the mtime (no longer needed after sort is done).
  • The second read in the while loop reads the filename (running until the NUL).
  • The loop increments, and then checks, a counter; if the counter's state indicates that we're past the initial skipping, then we print the filename, delimited by a NUL.
  • xargs -0 then appends that filename into the argv list it's collecting to invoke rm with.
Charles Duffy
  • 2
    did you miss `-0` in xargs? also, you may skip the first three by using group and dummy read: `{ read; read; read; while ... done; } < <(find ...)` this will avoid the need for a counter. – gniourf_gniourf Nov 05 '14 at 19:25
  • @gniourf_gniourf, yes, but they need to be dummy reads with `-d ''`, which makes them long enough that I went for the counter. Good catch on the `-0`; I only tested prior to that point. – Charles Duffy Nov 05 '14 at 19:26
  • Would `IFS=' ' read -r -d '' stamp filename` work instead of the (separate) dummy read? – Etan Reisner Nov 05 '14 at 19:33
  • I would appreciate more of an explanation of how this works, especially because my OS X system doesn't take the `-printf` arg to `find`. In particular, how does the `%M@` _"File's permissions (in symbolic form, as for ls)"_ help? – Stephen P Nov 05 '14 at 19:47
  • @StephenP, ...oops -- did I not backport that fix? Had it over to mtime locally. – Charles Duffy Nov 05 '14 at 20:43
  • @EtanReisner, yes, it should, but with the possible caveat of trimming trailing whitespace off the end of filenames. – Charles Duffy Nov 05 '14 at 20:43
  • @StephenP, ...unfortunately, this isn't so easy to implement correctly without GNU tools; the easy solution that doesn't require `find -printf` uses `stat --format`, which is likewise a GNUism... as is, of course, `sort -z`. I'd almost be tempted to lean on perl, if needing to support other platforms while retaining correctness. – Charles Duffy Nov 05 '14 at 20:45
  • Thanks for the edit with the explanation, I'd upvote a 2nd time for it. I tried using `xargs -0 ls -l --` for non-destructive testing and I needed to change to `(( count > 3 ))` for keeping 3 newest files, and could change to `if (( ++count > 3 )); then`... to eliminate the separate increment. – Stephen P Nov 05 '14 at 21:17
  • @StephenP, good suggestion re: incorporating the increment into the test. – Charles Duffy Nov 05 '14 at 21:21
  • Do you have any reasons for not `rm`ing the file inside the while loop, instead of the `printf/xargs` combo? – gniourf_gniourf Nov 05 '14 at 21:32
  • And by the way, you need to repeat `IFS=` for the second `read`. – gniourf_gniourf Nov 05 '14 at 21:37
  • 2
    @gniourf_gniourf, only reason is to avoid the number of external command invocations, calling `rm` once per times MAX_ARGV fills up being faster than once per call. Good catch on the missing `IFS=`. – Charles Duffy Nov 05 '14 at 21:39
  • 1
    Maybe reverse the sort? Newest files will have bigger dates, so the order must be descending (since we are keeping the first three on that list). – Velkan Mar 15 '17 at 08:46
  • This will not work, because `sort -n` does not sort floating point numbers correctly. Use `sort -g` instead. – ceving Apr 26 '17 at 09:49
  • 2
    "Does not work" is a little strong -- the general use case is one where sub-second precision doesn't matter (and on many filesystems, subsecond precision isn't even there in the input data to start with) -- but since we're already dependent on GNU tools, there's no harm to the change. – Charles Duffy Apr 26 '17 at 12:26
  • What about using `head -z -n -3` and `sed -z 's/^.* //'` to get the list to feed to `xargs`? – mbirth Aug 02 '19 at 10:16
  • @mbirth, you'd want it to be `sed -z 's/^[^ ]* //'` to not mangle filenames with spaces, but otherwise, looks solid to me. – Charles Duffy Aug 02 '19 at 15:07
  • @CharlesDuffy Thanks, forgot about the greediness of sed. – mbirth Aug 04 '19 at 18:57
9
ls -t | tail -n +4 | xargs -I {} rm {}

If you want a one-liner.

Michael Ballent
5

In zsh:

rm /files/to/delete/*(Om[1,-4])

If you want to include dotfiles, replace the parenthesized part with (Om[1,-4]D).

I think this works correctly with arbitrary chars in the filenames (just checked with newline).

Explanation: The parentheses contain Glob Qualifiers. O means "order by, descending", m means mtime (See man zshexpn for other sorting keys - large manpage; search for "be sorted"). [1,-4] returns only the matches at one-based index 1 to (last + 1 - 4) (note the -4 for deleting all but 3).

FunctorSalad
4

Don't use ls -t, as it is unsafe for filenames that may contain whitespace or special glob characters.

You can do this using only GNU utilities to delete all but the 3 newest files in the current directory:

find . -maxdepth 1 -type f -printf '%T@\t%p\0' |
sort -z -nrk1 |
tail -z -n +4 |
cut -z -f2- |
xargs -0 rm -f --
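As a sketch, the same pipeline can be wrapped in a function that only lists the candidates, so the result can be previewed before anything is deleted. The function name `list_all_but_newest` and its two parameters (directory, number of files to keep) are made up for illustration. It assumes GNU coreutils 8.25 or later, because `tail -z` and `cut -z` do not exist in older releases:

```shell
# Hypothetical helper: print, NUL-delimited, every regular file directly
# inside directory $1 except the $2 newest ones.
list_all_but_newest() {
  find "$1" -maxdepth 1 -type f -printf '%T@\t%p\0' |  # <mtime><TAB><path><NUL>
    sort -z -nrk1 |                                    # newest first
    tail -z -n +"$(( $2 + 1 ))" |                      # skip the $2 newest records
    cut -z -f2-                                        # drop the timestamp field
}

# Preview first (NULs shown as newlines), then delete for real:
#   list_all_but_newest /path/to/dir 3 | tr '\0' '\n'
#   list_all_but_newest /path/to/dir 3 | xargs -0 -r rm -f --
```

Keeping the listing separate from the deletion means the destructive step is a single, explicit `xargs -0 -r rm -f --` at the end.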
anubhava
3
ls -t | tail -n +4 | xargs -I {} rm {}

Michael Ballent's answer works best, as

ls -t | tail -n +4 | xargs rm --

throws an error when there are three or fewer files (nothing is left to delete, so rm is run without any operands).
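A small sketch of the empty-input behaviour, assuming GNU xargs. The helper name `runs_on_empty_input` is made up, and `echo ran` stands in for `rm` so nothing is deleted:

```shell
# Hypothetical helper: feed xargs an empty input and let it run "echo ran".
# Any extra arguments (such as -r) are passed through to xargs itself.
runs_on_empty_input() {
  printf '' | xargs "$@" echo ran
}

runs_on_empty_input       # GNU xargs still runs the command once
runs_on_empty_input -r    # with -r the command is not run at all
```

With `-I {}` the command is run once per input line, so an empty input also means zero invocations, which is why that variant does not produce the error.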

kl3sk
  • You want to pass `-r` to `xargs` or it will execute the command without parameters if the pipeline has no input. Unfortunately, `-r` is a GNU extension so it will not work with plain old POSIX compatible `xargs`. – Mikko Rantalainen Nov 16 '22 at 09:37
2

Recursive script with arbitrary num of files to keep per-directory

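Also handles files/dirs with spaces and other odd characters. Directory names may even contain newlines, but file names may not, because the per-directory file listing is newline-delimited.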

#!/bin/bash
if (( $# != 2 )); then
  echo "Usage: $0 </path/to/top-level/dir> <num files to keep per dir>" >&2
  exit 1
fi

while IFS= read -r -d $'\0' dir; do
  # Find the newest file that should be deleted (entry num+1 when sorted newest first)
  nthOldest=$(find "$dir" -maxdepth 1 -type f -printf '%T@\0%p\n' | sort -t '\0' -rg \
    | awk -F '\0' -v num="$2" 'NR==num+1{print $2}')

  if [[ -f "$nthOldest" ]]; then
    find "$dir" -maxdepth 1 -type f ! -newer "$nthOldest" -exec rm {} +
  fi
done < <(find "$1" -type d -print0)

Proof of concept

$ tree test/
test/
├── sub1
│   ├── sub1_0_days_old.txt
│   ├── sub1_1_days_old.txt
│   ├── sub1_2_days_old.txt
│   ├── sub1_3_days_old.txt
│   └── sub1\ 4\ days\ old\ with\ spaces.txt
├── sub2\ with\ spaces
│   ├── sub2_0_days_old.txt
│   ├── sub2_1_days_old.txt
│   ├── sub2_2_days_old.txt
│   └── sub2\ 3\ days\ old\ with\ spaces.txt
└── tld_0_days_old.txt

2 directories, 10 files
$ ./keepNewest.sh test/ 2
$ tree test/
test/
├── sub1
│   ├── sub1_0_days_old.txt
│   └── sub1_1_days_old.txt
├── sub2\ with\ spaces
│   ├── sub2_0_days_old.txt
│   └── sub2_1_days_old.txt
└── tld_0_days_old.txt

2 directories, 5 files
SiegeX
1

As an extension to the answer by flohall: if you want to remove all folders except the newest three, use the following:

find . -maxdepth 1 -mindepth 1 -type d -printf '%T@\t%p\n' |
 sort -t $'\t' -g | 
 head -n -3 | 
 cut -d $'\t' -f 2- |
 xargs rm -rf

-mindepth 1 excludes the current folder (.) itself, and -maxdepth 1 keeps find from descending into subfolders.

flohall
tutak
0

This uses find instead of ls, with a Schwartzian transform (decorate, sort, undecorate).

find . -type f -printf '%T@\t%p\n' |
sort -t $'\t' -g |
tail -3 |
cut -d $'\t' -f 2-

find searches for the files, decorates each with a timestamp, and uses a tab to separate the two values. sort splits the input at the tab and performs a general numeric sort, which sorts floating-point numbers correctly. tail keeps the last three lines (the newest three files), and cut undecorates.

The problem with decorations in general is finding a suitable delimiter that cannot be part of the input, here the file names. The pipeline above uses a tab, which can still occur in a file name; the NUL character cannot, so NUL-delimited output is the safer choice.
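Besides the field delimiter, the record terminator matters too. A minimal sketch, assuming GNU find, that counts how many lines the decorated listing contains for a directory; the helper name `count_listing_lines` is made up for illustration:

```shell
# Hypothetical helper: count the lines produced by the newline-terminated,
# timestamp-decorated listing of all regular files under directory $1.
count_listing_lines() {
  find "$1" -type f -printf '%T@\t%p\n' | wc -l
}

# Example (a single file whose name contains a newline is counted as two lines):
#   d=$(mktemp -d); touch "$d/$(printf 'a\nb')"; count_listing_lines "$d"; rm -rf -- "$d"
```

Every later stage of a line-based pipeline then sees two records for that one file, which is why the NUL-terminated variants use `\0` in the `-printf` format together with `sort -z` and `xargs -0`.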

Community
ceving
-2

Below worked for me:

rm -rf $(ll -t | tail -n +5 | awk '{ print $9}')

Nimantha
Mayur Chavan
  • This has a number of problems. `ll` is not a standard command, though it's frequently aliased to `ls -l` in beginner-oriented distros; but why request a long-form listing only to (not quite successfully) throw away the information provided by the long listing? Even for file names which don't contain whitespace, this has all the other problems of [parsing `ls`](http://mywiki.wooledge.org/ParsingLs) and, of course, `rm -rf` is not really at all correct for the use case in the question, and potentially quite hazardous here. – tripleee Sep 17 '21 at 06:39