18

I have a list of paths produced by running the "find" command in bash. For example, find returns these files:

test/a/file
test/b/file
test/file
test/z/file

I want to sort the output so it appears as:

test/file
test/a/file
test/b/file
test/z/file

Is there any way to sort the results within the find command, or by piping the results into sort?

Ken

4 Answers

25

If you have the GNU version of find, try this:

find test -type f -printf '%h\0%d\0%p\n' | sort -t '\0' -n | awk -F '\0' '{print $3}'

To use these file names in a loop, do

find test -type f -printf '%h\0%d\0%p\n' | sort -t '\0' -n | awk -F '\0' '{print $3}' | while IFS= read -r file; do
    # use "$file" (IFS= and -r keep leading/trailing whitespace and backslashes intact)
done

The find command prints three things for each file: (1) its directory, (2) its depth in the directory tree, and (3) its full name. By including the depth in the output we can use sort -n to sort test/file above test/a/file. Finally we use awk to strip out the first two columns since they were only used for sorting.

Using \0 as a separator between the three fields allows us to handle file names with spaces and tabs in them (but not newlines, unfortunately).
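To make the three fields visible, here is a small sketch that builds a throwaway copy of the example tree and replaces each NUL with | after sorting. It assumes GNU find (for -printf) and GNU sort; the temporary directory and the | stand-in are only there for illustration and are not part of the answer's pipeline.

```bash
#!/usr/bin/env bash
# Build a scratch copy of the example tree (hypothetical temp location).
tmp=$(mktemp -d)
mkdir -p "$tmp/test/a" "$tmp/test/z"
touch "$tmp/test/file" "$tmp/test/a/file" "$tmp/test/z/file"

# Decorate each path with its directory and depth, sort, then make the
# NUL separators visible by turning them into "|".
decorated=$(cd "$tmp" &&
    find test -type f -printf '%h\0%d\0%p\n' |
    LC_ALL=C sort -t '\0' -n |
    tr '\0' '|')

printf '%s\n' "$decorated"
# test|1|test/file
# test/a|2|test/a/file
# test/z|2|test/z/file

rm -rf "$tmp"
```

The last field is the one awk keeps; the first two exist only to steer the sort.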

$ find test -type f
test/b/file
test/a/file
test/file
test/z/file
$ find test -type f -printf '%h\0%d\0%p\n' | sort -t '\0' -n | awk -F'\0' '{print $3}'
test/file
test/a/file
test/b/file
test/z/file

If you are unable to modify the find command, then try this convoluted replacement:

find test -type f | while IFS= read -r file; do
    # emit: directory, the path's slashes (a stand-in for depth), full path
    printf '%s\0%s\0%s\n' "${file%/*}" "$(tr -dc / <<< "$file")" "$file"
done | sort -t '\0' | awk -F'\0' '{print $3}'

It does the same thing, with ${file%/*} being used to get a file's directory name and the tr command being used to delete everything except the slashes. The number of slashes left over corresponds to the file's "depth".
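For a single path, the two pieces described above evaluate as shown in this sketch. The variable names dir, slashes, and depth are mine, chosen just for the illustration.

```bash
#!/usr/bin/env bash
file='test/a/file'

dir=${file%/*}                    # drop the shortest trailing "/..." part
slashes=$(tr -dc / <<< "$file")   # delete every character except "/"
depth=${#slashes}                 # length of the slash string

printf 'dir=%s slashes=%s depth=%s\n' "$dir" "$slashes" "$depth"
# dir=test/a slashes=// depth=2
```

Note that this counts slashes in the whole path, so it only matches find's %d when the starting directory itself contains no slash.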

(I sure hope there's an easier answer out there. What you're asking doesn't seem that hard, but I am blanking on a simple solution.)

John Kugelman
  • I'm using this command as part of my bash script. Ideally I was hoping I could do something along the lines of `for file in $(find ... | sort ...)`. – Ken Jan 23 '13 at 23:18
  • @Ken See my edit. `find ... | while read file` should be preferred over `for file in $(find ...)`. The latter is not whitespace-safe, is slower, and can error out if there are too many file names to fit on the command line. – John Kugelman Jan 23 '13 at 23:38
  • @JohnKugelman and I would conjecture that `while read file; do ... ; done < <(find ... )` is preferred over all of them, because with process substitution the loop runs in the current shell rather than in the subshell a pipe creates, with the added bonus that any variables set inside the loop are still available after it. P.S. sorry for raising the dead on this one. – SiegeX May 06 '20 at 01:24
  • I'm finding that this result (at least the 'convoluted replacement') sorts correctly by parent directory, but the files within each parent directory are not sorted. It works for the OP's example, where every dir contains the single file `file`, but not in the general case. – drobert Mar 09 '21 at 17:08
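
The pipe versus process substitution point raised in the comments above can be seen with a counter. This is a minimal sketch assuming bash with its default settings (lastpipe off); list_files is a hypothetical stand-in for the find pipeline.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for "find ... | sort ...": three sample paths.
list_files() { printf '%s\n' 'test/file' 'test/a/file' 'test/b/file'; }

count_pipe=0
list_files | while IFS= read -r file; do
    count_pipe=$((count_pipe + 1))       # runs in a subshell; change is lost
done

count_procsub=0
while IFS= read -r file; do
    count_procsub=$((count_procsub + 1)) # runs in the current shell
done < <(list_files)

echo "pipe: $count_pipe, process substitution: $count_procsub"
# pipe: 0, process substitution: 3
```

If the loop body only prints or acts on each file, either form works; the difference matters once the loop needs to hand results back to the rest of the script.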
3
find test -type f -printf '%h\0%p\n' | sort | awk -F'\0' '{print $2}'

The result of find is, for example,

test/a'\0'test/a/file
test'\0'test/file
test/z'\0'test/z/file
test/b'\0'test/b/text file.txt
test/b'\0'test/b/file

where '\0' stands for the null character.

These compound strings can be properly sorted with a simple sort:

test'\0'test/file
test/a'\0'test/a/file
test/b'\0'test/b/file
test/b'\0'test/b/text file.txt
test/z'\0'test/z/file

And the final result is

test/file
test/a/file
test/b/file
test/b/text file.txt
test/z/file

(Based on John Kugelman's answer, with the "depth" element removed, since it is redundant.)

1

If you want to sort alphabetically, the best way is:

find test -print0 | sort -z

(The example in the original question actually wanted files listed before directories, which is not the same thing and requires extra steps.)
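
Since the output of that command is NUL-delimited, it has to be read back with a NUL-aware loop. The following is a sketch assuming bash and GNU sort (for -z); the scratch tree and the sorted array are made up for the demonstration, and LC_ALL=C is set only to make the ordering predictable.

```bash
#!/usr/bin/env bash
# Scratch tree for the demonstration (hypothetical temp location).
tmp=$(mktemp -d)
mkdir -p "$tmp/test/a" "$tmp/test/b"
touch "$tmp/test/file" "$tmp/test/a/file" "$tmp/test/b/text file.txt"

sorted=()
# -d '' makes read split on NUL, so spaces and even newlines in names survive.
while IFS= read -r -d '' path; do
    sorted+=("$path")
done < <(cd "$tmp" && find test -print0 | LC_ALL=C sort -z)

printf '%s\n' "${sorted[@]}"
# test
# test/a
# test/a/file
# test/b
# test/b/text file.txt
# test/file

rm -rf "$tmp"
```

The listing shows the caveat from the answer: in plain alphabetical order test/file lands after the subdirectories, not before them.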

zeroimpl
0

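Try this. For reference, it first sorts on a key starting at the second character of the second field, which in this example only exists for test/file (the directory names a, b and z are one character long); the r reverses that key so test/file comes first. After that it sorts on a key starting at the first character of the second field. (-t is the field delimiter, -k is the key.)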

find test -name file | sort -t'/' -k2.2r -k2.1

Run info sort for more information. There are many ways to combine -t and -k to get different results.
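
To try the key options without touching the file system, the four paths from the question can be fed to sort directly. This is only a sketch on that sample data, assuming GNU sort and the C locale; other directory names may well sort differently, so check against your own tree.

```bash
#!/usr/bin/env bash
# The question's sample paths, deliberately out of order.
paths='test/b/file
test/file
test/z/file
test/a/file'

# -t/ splits on "/", -k2.2r is the reversed first key, -k2.1 breaks ties.
sorted=$(printf '%s\n' "$paths" | LC_ALL=C sort -t/ -k2.2r -k2.1)

printf '%s\n' "$sorted"
# test/file
# test/a/file
# test/b/file
# test/z/file
```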

WhyteWolf