How to find all file extensions recursively from a directory?

Question

What command, or collection of commands, can I use to return all file extensions in a directory (including sub-directories)?

Right now, I'm using different combinations of ls and grep, but I can't find any scalable solution.

score 141 · Accepted Answer · edited Nov 26 '14 at 20:50

How about this:

find . -type f -name '*.*' | sed 's|.*\.||' | sort -u

edited Nov 26 '14 at 20:50

Martin Tournoij

answered Feb 14 '11 at 23:03

thkala

find [this directory] (files) (matching any name with an extension) | use sed to substitute anything preceding a period with nothing | sort with unique flag – Matthew Apr 13 '18 at 12:44
But this doesn't go into sub-directories. – 25b3nk Mar 28 '19 at 13:08
@BhaskarChakradhar Yes it does. What makes you think it doesn't? – Michael May 02 '19 at 16:19
Thank you, this is very useful. I'm using this on the Chromium source code directory and got thousands of file extensions; many of them are actually files without a file extension. Is there any way to ignore all files without a file extension? – jerry Mar 06 '21 at 08:32
This also gets all "dot files", like e.g. `.ctags`, which is usually not what you want. – DevSolar Mar 29 '22 at 12:26
Can I suggest changing `-name '*.*'` to `-name '[^.]*.*'`, as the former also picks up lots of "invisible" files such as temporary edits produced by xemacs? – Simon F Aug 16 '22 at 09:31
@DevSolar to exclude dot files you can require at least one character before the dot using the question mark, ``-name '?*.*'``. Glob matching is less powerful than regex but much faster. – Meitham Oct 10 '22 at 07:16
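
Putting the suggestions from these comments together, here is a minimal sketch that skips plain dotfiles with `-name '?*.*'` and also reports how often each extension occurs. The function name `list_extensions` is hypothetical, and the sketch assumes no filename contains a newline.

```shell
# list_extensions DIR: print each extension under DIR with its file count.
# Hypothetical helper; assumes no filename contains a newline.
list_extensions() {
  find "${1:-.}" -type f -name '?*.*' |   # at least one character before a dot
    sed 's|.*\.||' |                      # keep only what follows the last dot
    sort | uniq -c | sort -rn             # count, most frequent first
}
```

Calling `list_extensions .` scans the current directory. A file such as `.ctags` is skipped because it has no second dot, while `.foo.swp` would still be reported under `swp`.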

score 10 · Answer 2 · answered May 16 '18 at 02:13

List all extensions and their counts for the current directory and all sub-directories:

ls -1R | sed 's/[^\.]*//' | sed 's/.*\.//' | sort | uniq -c

answered May 16 '18 at 02:13

mindon

score 8 · Answer 3 · answered Oct 12 '12 at 22:42

find . -type f | sed 's|.*\.||' | sort -u

Also works on mac.

answered Oct 12 '12 at 22:42

marcosdsanchez

This solution doesn't ensure all files listed _have_ extensions, so files without them aren't fixed by sed and are treated _as_ extensions. – Matthew Apr 13 '18 at 12:47
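
To see what this comment describes, the same sed expression can be run on a couple of made-up paths whose basenames have no extension. The helper name `strip_to_last_dot` is only for illustration.

```shell
# Same substitution as in the answer, wrapped in a hypothetical helper.
strip_to_last_dot() {
  sed 's|.*\.||'
}

# Made-up paths: neither basename contains a dot, so the "last dot"
# sed finds belongs to the leading ./ or to a directory name.
printf '%s\n' './src/Makefile' './a.b/README' | strip_to_last_dot
```

This prints `/src/Makefile` and `b/README`, which then show up in the list as if they were extensions. Filtering with `-name '*.*'` first, as the accepted answer does, avoids this.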

score 3 · Answer 4 · answered Sep 16 '20 at 02:13

Another one, similar to the others, but it uses only two programs (find and awk):

find ./ -type f -name "*\.*" -printf "%f\n" | awk -F . '!seen[$NF]++ {print $NF}'

-type f restricts it to just files, not directories

-name "*\.*" ensures the filename has a . in it.

-printf "%f\n" prints just the filename, not the path to the filename.

-F . makes awk utilize a period as the field separator.

$NF is the last field, separated by periods.

!seen[$NF]++ evaluates to true the first time an extension is encountered, and false every other time it is encountered.

print $NF prints the extension.
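
The `!seen[$NF]++` idiom can be tried without touching the filesystem by feeding a few names to the same awk program. The filenames below are made up, and `unique_extensions` is a hypothetical wrapper name.

```shell
# seen[$NF]++ evaluates to 0 (false) the first time a value appears,
# so !seen[$NF]++ is true exactly once per extension.
unique_extensions() {
  awk -F . '!seen[$NF]++ {print $NF}'
}

printf '%s\n' a.txt b.md c.txt archive.tar.gz d.md | unique_extensions
```

This prints `txt`, `md` and `gz`, one per line. Unlike `sort -u`, the output keeps the order in which each extension was first seen.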

kurumi · Answer 5 · 2011-02-15T02:01:43.200

1

If you are using Bash 4+:

shopt -s globstar   # make ** match across sub-directories
for file in **/*.*
do
  echo "${file##*.}"   # strip everything up to the last dot
done

Ruby (1.9+):

ruby -e 'Dir["**/*.*"].each{|x|puts x.split(".")[-1]}' | sort -u

edited Feb 15 '11 at 02:01

answered Feb 15 '11 at 01:15

kurumi

For me, using `MSYS2`, the pattern `"${file##*.}"` prints only the final part of extensions with two dots (for example, it prints only `gz` when the extension is `.tar.gz`). The pattern `"${file#*.}"` prints every part of the extension. – Alex Hall Aug 30 '20 at 01:39
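
The difference between the two expansions is easy to check on a single made-up filename; the variable names here are only for illustration.

```shell
file='backup.tar.gz'      # made-up filename
last_ext="${file##*.}"    # remove the longest prefix matching '*.': text after the last dot
full_ext="${file#*.}"     # remove the shortest prefix matching '*.': text after the first dot
printf '%s\n' "$last_ext" "$full_ext"
```

This prints `gz` and then `tar.gz`. Note that `#*.` cuts at the first dot anywhere in the string, so on a path such as `./dir/x.tar.gz` it is safer to apply it to the basename (`${file##*/}`) rather than the whole path.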

score 0 · Answer 6 · answered Jun 27 '14 at 13:19

Boooom another:

find * | awk -F . {'print $2'} | sort -u

answered Jun 27 '14 at 13:19

ackuser

`echo 'gniourf.tar.gz' | awk -F . {'print $2'}` gives `tar` and `echo 'one.two.three.pdf' | awk -F . {'print $2'}` gives `two`. Are you sure your approach is the good one? – gniourf_gniourf Jun 27 '14 at 13:22
I think the above is a simple solution; here is another: `find . -type f -name "*.*" | awk -F. '!a[$NF]++{print $NF}'`. I don't think simple commands can catch every type of file. As you said, there are problems parsing every row, so in this case I am sure it is better to use a script in Python, Perl or similar, where you won't have this problem. Anyway, I posted a simple solution; if you know the extensions of the files, you can filter with a grep such as `| grep 'txt\|png\|pdf'`. Thanks – ackuser Jun 30 '14 at 09:20

TimeDelta · Answer 7 · 2014-10-09T21:12:11.160

0

ls -1 | sed 's/.*\.//' | sort -u

Update: You are correct Matthew. Based on your comment, here is an updated version:

ls -R1 | egrep -C 0 "[^\.]+\.[^\./:]+$" | sed 's/.*\.//' | sort -u

edited Oct 09 '14 at 21:12

answered Oct 04 '14 at 00:27

TimeDelta

This has two problems. First it only works for a flat directory, but misses subdirectories. Secondly, it includes all files without extensions in the output. – Matthew Oct 09 '14 at 13:11
[Don't parse the output of `ls`](http://mywiki.wooledge.org/ParsingLs), especially when it's useless. – gniourf_gniourf Jul 03 '15 at 17:14
You really should use ripgrep instead of egrep if you have time to install it first: https://github.com/BurntSushi/ripgrep. The updated command would be: `ls -R1 | rg -C 0 "[^\.]+\.[^\./:]+$" | sed 's/.*\.//' | sort -u`. I get at least a 10x improvement for huge folders. – james-see Dec 29 '18 at 22:24

score 0 · Answer 8 · answered Feb 10 '15 at 10:07

I was just quickly trying this as I was searching Google for a good answer. I am more inclined to regex than Bash, but this also works for subdirectories. I don't think it includes files without extensions either:

ls -R | egrep '(\.\w+)$' -o | sort | uniq -c | sort -r

answered Feb 10 '15 at 10:07

Mehcs85

[Don't parse the output of `ls`](http://mywiki.wooledge.org/ParsingLs), especially when it's useless. – gniourf_gniourf Jul 03 '15 at 17:15

score 0 · Answer 9 · answered Feb 15 '11 at 14:12

Yet another solution using find (that should even sort file extensions with embedded newlines correctly):

# [^.]: exclude dotfiles
find . -type f -name "[^.]*.*" -exec bash -c '
  printf "%s\000" "${@##*.}"
' argv0 '{}' + |
sort -uz | 
tr '\0' '\n'
