59

I'm trying to use GNU find to find only the directories that contain no other directories, but may or may not contain regular files.

My best guess so far has been:

find dir -type d \( -not -exec ls -dA ';' \)

but this just gets me a long list of "."

Thanks!

Thomas G Henry LLC
  • 1
    When using -exec, the {} argument is expanded to the path of the currently inspected filesystem object (file / directory / ...). So you should have used the following command to print the directories: `find dir -type d \( -not -exec ls -dA {} \; \)` – Sylvain Defresne Nov 24 '10 at 17:47
  • 2
    Same question on Super User: [Using “find” to list only directories with no more childs](http://superuser.com/questions/195879/using-find-to-list-only-directories-with-no-more-childs) – Gilles 'SO- stop being evil' Nov 24 '10 at 23:25
  • See also: [List all leaf subdirectories in linux](http://stackoverflow.com/questions/1574403/list-all-leaf-subdirectories-in-linux). – Dennis Williamson Nov 25 '10 at 02:19
  • 1
    Since this question ranks highly in search, see https://stackoverflow.com/a/9418016/315024 which gives the simplest answer: `find -type d -empty` – Walf Nov 23 '21 at 06:54

9 Answers

101

You can use -links if your filesystem is POSIX compliant (i.e. a directory has a link for each subdirectory in it, a link from its parent and a link to itself, thus a count of 2 links if it has no subdirectories).

The following command should do what you want:

find dir -type d -links 2
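
Whether this link-count convention holds can be probed before relying on it. The following is a minimal sketch, not part of the original answer: `links_follow_convention` is a hypothetical helper name, and it assumes a `mktemp` that accepts a path template with `-d` (as the GNU and BSD versions do). It creates a scratch directory with one subdirectory and checks that `ls -ld` reports 3 links.

```shell
#!/bin/sh
# Sketch: succeed if the filesystem holding "$1" counts one link per
# subdirectory. links_follow_convention is a hypothetical name.
links_follow_convention() {
    # Scratch directory inside the tree to be searched (assumes it is writable).
    probe=$(mktemp -d "${1:-.}/linkprobe.XXXXXX") || return 2
    mkdir "$probe/sub" || { rm -rf "$probe"; return 2; }
    # The second field of `ls -ld` is the link count: 2 plus 1 subdirectory.
    count=$(ls -ld "$probe" | awk '{ print $2 }')
    rm -rf "$probe"
    [ "$count" -eq 3 ]
}
```

A caller might run `links_follow_convention dir && find dir -type d -links 2` and switch to a slower method when the probe fails.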

However, it does not seem to work on Mac OS X (as @Piotr mentioned). Here is another version that is slower, but does work on Mac OS X. It is based on his version, with a correction to handle whitespace in directory names:

find . -type d -exec sh -c '(ls -p "{}"|grep />/dev/null)||echo "{}"' \;
tripleee
Sylvain Defresne
  • @SylvainDefresne, any idea if it will work on NetApp file system over NFS? – oz123 Aug 12 '13 at 08:57
  • I just used the first version (-links 2) on an NetApp over NFS. So the answer is yes. – Paul Holbrook Aug 01 '14 at 12:37
  • 2
    Similarly, the simple soln doesn't seem to work in Cygwin (windows 7), but the extended OSx version does – Eric B. Jan 05 '15 at 19:16
  • 2
    in my btrfs system directories have link count 1, so this doesn't work. – miguel.negrao May 19 '16 at 11:03
  • The replacement string `{}` should be single-quoted to `sh -c`, not double quoted, since filenames might contain characters treated specially under double quotes (such as `$`). – eigengrau Sep 16 '17 at 16:10
  • I have found a portable solution (both mac and linux) that doesn't involve `find -exec` (see my answer). That was far too slow for me as it runs too many processes for each dir. – ahmet alp balkan May 27 '18 at 04:00
  • Even in 2019 Mac hasn't fixed this. – Sridhar Sarnobat Feb 18 '19 at 16:12
  • Unfortunately, this doesn't work on NTFS disks. Neither the 1st solution with `-links 2`, nor the second because there are directories with a `$` in the name (like `$RECYCLE.BIN`). But it's fine on ext[234] partitions. – mivk Mar 31 '20 at 19:50
  • [Never embed `{}` in the shell code!](https://unix.stackexchange.com/a/156010/108618) – Kamil Maciorowski May 21 '23 at 21:23
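
Following the comments above about not embedding `{}` in the shell code, here is a hedged sketch of the same `ls -p` idea with each path passed to `sh` as a positional parameter instead. `list_leaf_dirs` is a hypothetical name, and adding `-A` (so hidden subdirectories are seen) is an assumption beyond the original command. Using `{} +` lets one shell handle many directories, which should also reduce the number of processes.

```shell
#!/bin/sh
# Sketch: print directories that contain no subdirectory, without relying on
# link counts. list_leaf_dirs is a hypothetical name used for illustration.
list_leaf_dirs() {
    find "${1:-.}" -type d -exec sh -c '
        for dir do
            # ls -A includes hidden entries; -p appends "/" to directories.
            if ! ls -Ap "$dir" | grep -q /; then
                printf "%s\n" "$dir"
            fi
        done
    ' sh {} +
}
```

Because the path never becomes part of the script text, names containing `$`, quotes, or spaces are treated as plain data.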
6

I just found another solution to this that works on both Linux & macOS (without find -exec)!

It involves sort (twice) and awk:

find dir -type d | sort -r | awk 'a!~"^"$0{a=$0;print}' | sort

Explanation:

  1. sort the find output in reverse order

    • now you have subdirectories appear first, then their parents
  2. use awk to omit a line if it is a prefix of the last line that was printed

    • (this command is from the answer here)
    • now you have eliminated all parent directories (you're left with the leaf directories)
  3. sort them (so it looks like the normal find output)
  4. Voila! Fast and portable.
ahmet alp balkan
  • The only problem with this ingenious/portable answer is that, as pointed out [here](https://stackoverflow.com/a/52467913/1982385) it will fail if any character in the folder name is a regex special character. I've made a small modification and posted my answer [here](https://stackoverflow.com/a/62632786/1982385). – Daniel Gray Jun 29 '20 at 06:55
  • This will not work if one directory starts with a substring of another. For example, if one leaf directory is called "foo", and another "foobar", this will only show "foobar". – Chris Down Nov 24 '20 at 12:46
  • for that matter you can use sed to append a '/' to the end of each line before awk and then remove them after awk – Nathaniel_Wu Oct 07 '21 at 05:33
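
The limitations raised in these comments can be addressed together. The sketch below is one possible variation, not the answer's original command: it appends `/` to every path before sorting (so `foo/` is not a prefix of `foobar/`), compares with awk's literal `index()` rather than a regex, and then strips the slash again. `leaf_dirs_sorted` is a hypothetical name. It assumes the start directory is given without a trailing slash and that no directory name contains a newline.

```shell
#!/bin/sh
# Sketch: leaf directories via sort + awk with a literal prefix test.
# leaf_dirs_sorted is a hypothetical name used for illustration.
leaf_dirs_sorted() {
    find "$1" -type d |
        sed 's|$|/|' |       # "dir" becomes "dir/" before sorting
        LC_ALL=C sort -r |   # byte order: each parent follows its descendants
        awk 'index(prev, $0) != 1 { print } { prev = $0 }' |
        sed 's|/$||' |       # drop the helper slash again
        LC_ALL=C sort
}
```

Sorting in the C locale with the slash already appended keeps each directory adjacent to its descendants, so comparing against the immediately preceding line is enough.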
3

@Sylvain's solution didn't work for me on Mac OS X for some obscure reason, so I came up with a slightly more direct solution. Hope this will help someone:

find . -type d  -print0 | xargs -0 -IXXX sh -c '(ls -p XXX | grep / >/dev/null) || echo XXX' ;

Explanation:

  • ls -p ends directories with '/'
  • so (ls -p XXX | grep / >/dev/null) returns 0 if there is at least one subdirectory and non-zero if there are none, in which case the echo runs
  • -print0 and -0 make xargs handle spaces in directory names
Piotr Czapla
  • Confused. `find -print0` and `xargs -0` are also not available out of the box on MacOS; but of course, you can avoid them both with `find -exec`, like Sylvain's updated answer demonstrates. – tripleee Sep 18 '22 at 09:04
  • I liked this solution. Seems very readable and a great alternative for cases when the `links 2` approach does not work. I did need to double quote the `XXX`s though. – user1593842 May 29 '23 at 20:31
2

I have some oddly named files in my directory trees that confuse awk as in @AhmetAlpBalkan's answer. So I took a slightly different approach:

  p=
  while IFS= read -r c; do
      l=${#c}            # length of the current path
      f=${p:0:$l}        # leading l characters of the previous path
      if [ "$f" != "$c" ]; then
          printf '%s\n' "$c"
      fi
      p=$c
  done < <(find . -type d | sort -r)

As in the awk solution, I reverse sort. That way if the directory path is a subpath of the previous hit, you can easily discern this.

Here p is my previous match, c is the current match, l is the length of the current match, f is the first l matching characters of the previous match. I only echo those hits that don't match the beginning of the previous match.

The problem with the awk solution offered is that the matching of the beginning of the string seems to be confused if the path name contains things such as + in the name of some of the subdirectories. This caused awk to return a number of false positives for me.

A.Ellett
1

There is an alternative to find called rawhide (rh) that is much easier to use.

For filesystems other than btrfs:

rh 'd && nlink == 2'

For btrfs:

rh 'd && "[ `rh -red %S | wc -l` = 0 ]".sh'

A shorter/faster version for btrfs is:

rh 'd && "[ -z \"`rh -red %S`\" ]".sh'

The above commands search for directories and then list their sub-directories and only match when there are none (the first by counting the number of lines of output, and the second by checking if there is any output at all per directory).

For a version that works on all filesystems as efficiently as possible:

rh 'd && (nlink == 2 || nlink == 1 && "[ -z \"`rh -red %S`\" ]".sh)'

On normal (non-btrfs) filesystems, this will work without the need for any additional processes for each directory, but on btrfs, it will need them. This is probably best if you have a mix of different filesystems including btrfs.

Rawhide (rh) is available from https://raf.org/rawhide or https://github.com/raforg/rawhide. It works at least on Linux, FreeBSD, OpenBSD, NetBSD, Solaris, macOS, and Cygwin.

Disclaimer: I am the current author of rawhide.

raf
  • The `wc -l` variant looks suspicious; perhaps see also [useless use of `wc`](https://www.iki.fi/era/unix/award.html#wc) – tripleee Sep 18 '22 at 10:21
  • Yes. The wc can be avoided. That's what the second version demonstrates. Instead of wc counting the lines of rh output and the shell comparing that against zero, the shell just measures the length of any rh output. – raf Sep 20 '22 at 09:12
0

What about this one? It's portable and it doesn't depend on finicky link counts. Note, however, that it's important to give root/folder without the trailing /.

find root/folder -type d | awk '{ if (length($0)<length(prev) || substr($0,1,length(prev))!=prev) print prev; prev=($0 "/") } END { print prev }'
DREV
0

Here is a solution that works on Linux and OS X:

find . -type d -execdir bash -c '[ "$(find "$1" -mindepth 1 -type d)" ] || echo "$PWD/$1"' bash {} \;

or:

find . -type d -execdir sh -c 'test -z "$(find "$1" -mindepth 1 -type d)" && echo "$PWD/$1"' sh {} \;
kenorb
0

This awk/sort pipe works a bit better than the one originally proposed in this answer, but is heavily based on it :) It will work more reliably regardless of whether the path contains regex special characters or not:

find . -type d | sort -r | awk 'index(a,$0)!=1{a=$0;print}' | sort

Remember that awk strings are 1-indexed instead of 0-indexed, which might be strange if you're used to working with C-based languages.

If the index of the current line in the previous line is 1 (i.e. it starts with it) then we skip it, which works just like the match of "^"$0.

Daniel Gray
  • This will fail to match directories whose name is a prefix of a sibling directory. E.g. if you have paths `/a/a` and `/a/ab`, then `/a/a` will not be reported. – Ruud Aug 24 '20 at 20:18
  • How about using -depth option of find like this: ```find . -depth -type d | awk 'index(a,$0)!=1{a=$0;print}'``` – Chubler_XL Aug 01 '22 at 23:59
  • This will obviously fail on directory names which contain newlines. – tripleee Sep 18 '22 at 09:00
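
Combining the `-depth` suggestion with a trailing-slash comparison appears to avoid the sibling-prefix problem and needs no sort at all. This is a sketch under assumptions, not the answer's command: `leaf_dirs_depth` is a hypothetical name, the start directory is assumed to be given without a trailing slash, and paths containing newlines are still not handled.

```shell
#!/bin/sh
# Sketch: with -depth, a directory is printed right after its last
# subdirectory, so only the immediately preceding line needs checking.
# leaf_dirs_depth is a hypothetical name used for illustration.
leaf_dirs_depth() {
    find "$1" -depth -type d |
        awk '{ cur = $0 "/" }                  # compare with a trailing slash
             index(prev, cur) != 1 { print }   # previous path is not below this one
             { prev = cur }'
}
```

With paths `/a/a` and `/a/ab`, the comparison is between `/a/a/` and `/a/ab/`, so neither is mistaken for the parent of the other. The output is in find's traversal order; pipe it through `sort` if a sorted list is wanted.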
0

My 2 cents on this problem:

#!/bin/bash
(
while IFS= read -r -d $'\0' directory
do
    files=$(ls -A "$directory" | wc -l)
    if test $files -gt 0 
    then
        echo "$directory"
    fi
done < <(find . -type d -print0)
) | sort | uniq

It uses a subshell to capture output from the run, and lists directories which have files.

Niloct
  • I don't believe the subshell is necessary or useful, actually. The `-print0` option to `find` is a GNU extension, and not properly portable. [Using `ls` in scripts](http://mywiki.wooledge.org/ParsingLs) is always suspicious. – tripleee Sep 19 '22 at 14:26
  • This was on https://mywiki.wooledge.org/BashFAQ/020. – Niloct Sep 19 '22 at 14:54
  • The subshell is needed to capture the output, otherwise all directories are echoed as listed. – Niloct Sep 19 '22 at 14:55
  • Redirecting `done` into the pipe suffices eminently for that. – tripleee Sep 19 '22 at 15:12