24

Is there a faster way to remove a directory than simply running

rm -r -f *directory*

? I am asking because our daily cross-platform builds are really huge (e.g. 4 GB per build), so the hard disks on some of the machines frequently run out of space.

This is particularly the case for our AIX and Solaris platforms.

Maybe there are 'special' commands for directory removal on these platforms?

PASTE-EDIT (moved my own separate answer into the question):

I am generally wondering why 'rm -r -f' is so slow. Doesn't 'rm' just need to modify the '..' or '.' files to de-allocate filesystem entries?

Something like

mv *directory* /dev/null

would be nice.

  • I'm not being specious; buy bigger hard disks. 4GB is not so big in the world of TB storage. – KevinDTimm Nov 25 '09 at 08:08
  • What filesystem is this? – Grzegorz Oledzki Nov 25 '09 at 08:31
  • I've heard of 'facetious' but 'specious' is a new one on me. +1 :p – mpen Nov 25 '09 at 08:32
  • >> Grzegorz Oledzki wrote: "What filesystem is this?" On AIX it is 'jfs', on the Linux build machine it is 'ext3', on the Solaris build machine it is ??? I don't know; I have to check tomorrow. –  Nov 25 '09 at 20:32
  • @Vokuhila-Oliba: This is totally OT, but where does your name come from? I don't get it but think it's hilarious. Being german, I get the first part but not the second? – Pekka Dec 06 '09 at 00:14
  • @Pekka: VorneKurzHintenLang-OberlippenBart - see www.wikipedia.de ;-) –  Dec 06 '09 at 18:41

13 Answers

24

For deleting a directory from a filesystem, rm is your fastest option. On Linux, we sometimes do our builds (a few GB) in a ramdisk, and it has a really impressive delete speed :) You could also try different filesystems, but on AIX/Solaris you may not have many options...
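A minimal sketch of such a ramdisk on Linux (tmpfs; the size and mount point here are placeholders, not from the original answer):

mkdir -p /mnt/buildram
mount -t tmpfs -o size=6g tmpfs /mnt/buildram   # builds placed here live in RAM
# ... run the build in /mnt/buildram ...
umount /mnt/buildram                            # the whole build disappears instantly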

If your goal is to have the directory $dir empty now, you can rename it, and delete it later from a background/cron job:

mv "$dir" "$dir.old"
mkdir "$dir"
# later
rm -r -f "$dir.old"
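If the later cleanup should start right away rather than from cron, a sketch of kicking it off in the background (assuming the rename above has already been done):

nohup rm -r -f "$dir.old" >/dev/null 2>&1 &   # delete in the background; survives logout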

Another trick is to create a separate filesystem for $dir, and when you want to delete it, you simply re-create the filesystem. Something like this:

# initialization
mkfs.something /dev/device
mount /dev/device "$dir"


# when you want to delete it:
umount "$dir"
# re-init
mkfs.something /dev/device
mount /dev/device "$dir"
Balázs Pozsár
  • "umount 'temp-fs' and re-create" is such a nice idea! I am accepting this as the best answer! –  Dec 02 '09 at 19:06
  • this also meshes well with using LVM to allocate a new filesystem from some free space and deallocate it afterwards. or to base your new filesystem on template using LVM's snapshot feature. – araqnid Apr 12 '10 at 12:31
20

I forgot the source of this trick but it works:

EMPTYDIR=$(mktemp -d)
rsync -r --delete $EMPTYDIR/ dir_to_be_emptied/
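A sketch of the same trick wrapped in a helper that also removes the temporary empty directory afterwards (empty_dir is a hypothetical name, not part of the original tip):

empty_dir() {
    target="$1"
    empty=$(mktemp -d)                       # temporary empty directory
    rsync -r --delete "$empty"/ "$target"/   # mirror the empty dir into the target
    rmdir "$empty"                           # clean up the temporary directory
}

empty_dir dir_to_be_emptied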
yegle
  • Source (maybe): http://www.quora.com/File-Systems/How-can-someone-rapidly-delete-400-000-files – ZelluX Jun 01 '13 at 17:17
  • I would love to understand *why* the rsync approach is so much faster – Rob Latham Nov 11 '13 at 19:43
  • I don't think it is faster, or if it is, it's highly dependent on filesystem and other factors. I made 1000000 files and used `rm -rf foo/`, took 3 minutes. Then I tried this, but killed it when it took over twice as long. – Izkata Jan 10 '14 at 20:36
  • @MarkLopes Try WayBackMachine from archive.org. – Alix Axel Sep 10 '14 at 07:22
  • See also here http://unix.stackexchange.com/questions/37329/efficiently-delete-large-directory-containing-thousands-of-files – Yo Ludke May 07 '15 at 12:56
  • http://unix.stackexchange.com/questions/37329/efficiently-delete-large-directory-containing-thousands-of-files – Peter Mar 01 '16 at 02:30
  • You can add this link for reference: https://web.archive.org/web/20130929001850/http://linuxnote.net/jianingy/en/linux/a-fast-way-to-remove-huge-number-of-files.html – cprakashagr Jul 17 '18 at 12:28
  • @RobLatham, you can refer this to get some numbers on why this works: https://serverfault.com/a/328305/1034612 – Vibhor Dube Jul 14 '23 at 11:21
6

On AIX at least, you should be using LVM, the logical volume manager. All our systems bundle all the physical hard drives into a single volume group and then create one big honkin' file system out of that.

That way, you can add physical devices to your machine at will and increase the size of your file system to whatever you need.

One other solution I've seen is to allocate a trash directory on each file system and use a combination of mv and a find cron job to tackle the space problem.

Basically, have a cron job that runs every ten minutes and executes:

rm -rf /trash/*
rm -rf /filesys1/trash/*
rm -rf /filesys2/trash/*

Then, when you want your specific directory on that file system recycled, use something like:

mv /filesys1/overnight /filesys1/trash/overnight

and, within the next ten minutes your disk space will start being recovered. The filesys1/overnight directory will immediately be available for use even before the trashed version has started being deleted.

It's important that the trash directory be on the same filesystem as the directory you want to get rid of, otherwise you have a massive copy/delete operation on your hands rather than a relatively quick move.
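A sketch of the crontab entry for that ten-minute cleanup (the */10 step syntax works on Vixie-style crons; older AIX/Solaris crons may need the minutes listed explicitly as 0,10,20,30,40,50):

*/10 * * * * rm -rf /trash/* /filesys1/trash/* /filesys2/trash/* >/dev/null 2>&1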

paxdiablo
6

rm -r directory works by recursing depth-first down through directory, deleting files, and deleting the directories on the way back up. It has to, since you cannot delete a directory that is not empty.

Long, boring details: each file system object is represented by an inode; the file system keeps a single, file-system-wide, flat array of inodes.[1] If you just deleted the directory without first deleting its children, then the children would remain allocated, but without any pointers to them. (fsck checks for that kind of thing when it runs, since it represents file system damage.)

[1] That may not be strictly true for every file system out there, and there may be a file system that works the way you describe. It would possibly require something like a garbage collector. However, all the common ones I know of act like fs objects are owned by inodes, and directories are lists of name/inode number pairs.
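To see that order with standard tools: find -depth lists a directory's contents before the directory itself, which is essentially the traversal rm -r performs. A sketch (the $directory variable is just a placeholder; GNU find's -delete does the same in one pass):

find "$directory" -depth ! -type d -exec rm -f {} +    # first pass: remove non-directories, children before parents
find "$directory" -depth -type d -exec rmdir {} +      # second pass: remove the now-empty directories bottom-up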

Tommy McGuire
3

If rm -rf is slow, perhaps you are using a "sync" option or similar, which is writing to the disk too often. On Linux ext3 with normal options, rm -rf is very quick.

One option for fast removal which would work on Linux and presumably also on various Unixen is to use a loop device, something like:

hole temp.img $[5*1024*1024*1024]  # create a 5Gb "hole" file
mkfs.ext3 temp.img
mkdir -p mnt-temp
sudo mount temp.img mnt-temp -o loop

The "hole" program is one I wrote myself to create a large empty file using a "hole" rather than allocated blocks on the disk, which is much faster and doesn't use any disk space until you really need it. http://sam.nipl.net/coding/c-examples/hole.c

I just noticed that GNU coreutils contains a similar program "truncate", so if you have that you can use this to create the image:

truncate --size=$[5*1024*1024*1024] temp.img

Now you can use the mounted image under mnt-temp for temporary storage, for your build. When you are done with it, do this to remove it:

sudo umount mnt-temp
rm temp.img
rmdir mnt-temp

I think you will find that removing a single large file is much quicker than removing lots of little files!

If you don't care to compile my "hole.c" program, you can use dd, but this is much slower:

dd if=/dev/zero of=temp.img bs=1024 count=$[5*1024*1024]  # create a 5Gb allocated file
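As a comment below also notes, dd's seek with a zero count can create a sparse file without allocating blocks (a sketch; the G size suffix is GNU dd syntax and may not be available on AIX/Solaris):

dd if=/dev/zero of=temp.img bs=1 count=0 seek=5G   # 5 GB sparse file, no blocks actually written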
Sam Watkins
  • This sounds like a 'special' solution. But if it works on AIX/Solaris I can give it a try. For Linux I don't have this problem since the hard disk is so big that it is not an issue. –  Nov 25 '09 at 20:43
  • you can also use `dd`'s "seek" command and a count of zero to create files with holes. – mihi Nov 25 '09 at 21:41
  • You can do this sort of thing on any OS that has support for loop devices or logical volume management - create a temporary file system in a file or on the LVM, do your work in it, then just delete the whole filesystem (remove the file), which should be almost instantaneous, like deleting a large DVD image or whatever from your disk. The example commands I gave are for Linux, but the same thing should be possible on any *nix worth its salt, maybe even on 'doze. – Sam Watkins Nov 26 '09 at 07:04
2

I think there is actually nothing faster than "rm -rf", as you quoted, for deleting your directories.

To avoid doing it manually over and over, you can set up a daily cron job running a script that recursively deletes all the build directories under your build root directory if they are "old enough", with something like:

find <buildRootDir>/* -prune -mtime +4 -exec rm -rf {} \;

(here -mtime +4 means "anything older than 4 days")

Another way would be to configure your builder (if it allows such things) to overwrite the previous build with the current one.

Michael Zilbermann
2

On Solaris, this is the fastest way I have found.

find /dir/to/clean -type f|xargs rm

If you have files with odd paths, use

find /dir/to/clean -type f|while read line; do echo "$line";done|xargs rm 
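If GNU findutils is available, null-terminated names handle paths with spaces or newlines more robustly (a sketch; stock Solaris find/xargs may not support -print0/-0):

find /dir/to/clean -type f -print0 | xargs -0 rm -f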
Roman C
smoltz
  • I agree; just tried the first command with xargs and it is WAY faster than anything else here, in particular when running on slow disks. – olivierg Nov 12 '18 at 13:34
2

Use

perl -e 'for(<*>){((stat)[9]<(unlink))}'

See this link for a comparison of deletion methods: http://www.slashroot.in/which-is-the-fastest-method-to-delete-files-in-linux

2

I was looking into this as well.

I had a dir with 600,000+ files.

rm * would fail, because there were too many entries (the argument list was too long).

find . -exec rm {} \; was nice, deleting ~750 files every 5 seconds. I was checking the rm rate via another shell.

So instead I wrote a short script to rm many files at once, which managed about 1000 files every 5 seconds. The idea is to put as many files into one rm command as you can, to increase the efficiency.

#!/usr/bin/ksh
# Collect file names from 'filelist' and remove them in batches of 40
string="";
count=0;
for i in $(cat filelist);do
    string="$string $i";
    count=$(($count + 1));
    if [[ $count -eq 40 ]];then
        rm $string
        string="";
        count=0;
    fi
done
# remove any remaining files from the last, partial batch
if [[ -n $string ]];then
    rm $string
fi
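The same batching can be done with xargs (a sketch, assuming filelist holds one path per line with no embedded whitespace):

xargs -n 40 rm -f < filelist   # hand rm its arguments in batches of 40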
Scott
1

I needed to delete 700 GB from dozens of directories on a 1 TB AWS EBS disk (ext3) before copying the remainder to a new 200 GB XFS volume. The deletion was taking hours and leaving that volume at 100% wa (I/O wait). Since disk I/O and server time are not free, I did the following instead, which took only a fraction of a second per directory.

where /dev/sdb is an empty volume of any size

directory_to_delete=/ebs/var/tmp/

mount /dev/sdb $directory_to_delete

nohup rsync -avh /ebs/ /ebs2/

Mark306
1

I coded a small Java application, RdPro (Recursive Directory Purge tool), which is faster than rm. It can also remove user-specified target directories under a root. It works on both Linux/Unix and Windows, and has both a command-line version and a GUI version.

https://github.com/mhisoft/rdpro

Tony
0

I had to delete more than 300,000 files on Windows. I had Cygwin installed. Luckily, I had all the primary directories in a database, so I created a for loop over those entries and deleted each one using rm -rf.
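A sketch of that kind of loop, assuming the directory list from the database has been exported, one path per line, to a file named dirlist.txt (a hypothetical name):

while read -r dir; do
    rm -rf "$dir"        # remove each directory listed in the exported file
done < dirlist.txt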

gnuyoga
0

I just use find ./ -delete in the folder to be emptied; it deleted 620,000 directories (100 GB in total) in around 10 minutes.

Source: a comment on this page: https://www.slashroot.in/comment/1286#comment-1286

Arzhh