Code:

find . -type f -exec file -b -- {} \; | sort | uniq -c | \
  sort -r -n | awk '{$1=""; print $0;}'

Output:

 GIF image data, version 89a, 57 x 68
 GIF image data, version 89a, 8 x 8
 GIF image data, version 89a, 17 x 11
 PNG image data, 128 x 128, 8-bit/color RGBA, non-interlaced
 JPEG image data, JFIF standard 1.02, aspect ratio, density 100x100, segment length 16, baseline, precision 8, 100x457, frames 3
 JPEG image data, Exif standard: [TIFF image data, little-endian, direntries=0], baseline, precision 8, 510x300, frames 3
 HTML document, UTF-8 Unicode text, with CRLF line terminators
 GIF image data, version 89a, 960 x 4
 GIF image data, version 89a, 46 x 42
 GIF image data, version 89a, 100 x 100
 Composite Document File V2 Document, Cannot read section info
 ASCII text, with CRLF line terminators

Desired output:

GIF image data, version 89a, 57 x 68
GIF image data, version 89a, 8 x 8
GIF image data, version 89a, 17 x 11
PNG image data, 128 x 128, 8-bit/color RGBA, non-interlaced
JPEG image data, JFIF standard 1.02, aspect ratio, density 100x100, segment length 16, baseline, precision 8, 100x457, frames 3
JPEG image data, Exif standard: [TIFF image data, little-endian, direntries=0], baseline, precision 8, 510x300, frames 3
HTML document, UTF-8 Unicode text, with CRLF line terminators
GIF image data, version 89a, 960 x 4
GIF image data, version 89a, 46 x 42
GIF image data, version 89a, 100 x 100
Composite Document File V2 Document, Cannot read section info
ASCII text, with CRLF line terminators

This is probably fairly easy, but I can't wrap my head around it: how do I remove the leading space from each line?

arved
Samuel Hulla

4 Answers


Use sub() to remove the initial space.

find . -type f -exec file -b -- {} \; | sort | uniq -c | sort -r -n | awk '{$1=""; sub("^ ", ""); print $0;}'
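To see where the stray space comes from, here is a minimal sketch on a single made-up `uniq -c` line (the sample text and the variable names are only illustrative). Assigning to `$1` makes awk rebuild `$0` from its fields joined by the output field separator, so the emptied first field is still followed by one space, which `sub()` then removes.

```shell
# Hypothetical sample line in `uniq -c` style: right-justified count, then the text.
line='      3 GIF image data, version 89a, 8 x 8'

# Clearing $1 rebuilds $0 with OFS (a single space) between fields,
# so the empty first field leaves one leading space behind.
blanked=$(printf '%s\n' "$line" | awk '{$1=""; print $0}')

# sub() with an anchored pattern strips that one leftover space.
trimmed=$(printf '%s\n' "$line" | awk '{$1=""; sub(/^ /, ""); print $0}')

# Brackets make the leading space visible.
printf '[%s]\n[%s]\n' "$blanked" "$trimmed"
```

The first bracketed line should still start with a space and the second should not.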
Barmar

This is what you are looking for:

find . -type f -exec file -b -- {} \; | sort | uniq -c | sort -r -n | awk '{$1=""; print $0;}' | sed 's/ //'

You can replace awk with sed in this context:

 ... | sort -nr | sed -E 's/ *[0-9]+ //'

This removes the leading count together with the spaces around it.

karakfa
  • Correct in most cases. But if one file type occurred a million or more times, `uniq -c` would print *0* leading spaces, so the `sed` expression should use `'s/ *` rather than `'s/ +`. – agc Mar 23 '17 at 12:10
  • Good point, I didn't know that. Fixed. – karakfa Mar 23 '17 at 13:33

A method that uses sed at the end instead of awk:

find . -type f -exec file -b -- {} \; | sort | uniq -c | \
  sort -r -n | sed -E 's/^ *[0-9]+ //'

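Note: any code must allow for the output of uniq -c being right-justified. uniq -c prints 0-6 leading spaces, depending on how many digits the count has (not on the number of unique items), and a count of eight or more digits simply pushes the text further right. Example: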

for f in 1 10 1000 100000 1000000 10000000 ; do yes "$f" | head -n "$f" ; done | uniq -c
      1 1
     10 10
   1000 1000
 100000 100000
1000000 1000000
10000000 10000000
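As a quick check of that point, the sketch below feeds two made-up lines through the sed expression, one with a padded count and one with a seven-digit count that leaves no padding. The `strip_count` helper and the sample text are hypothetical, and `printf '%7d'` only imitates the padding seen in the output above. It also runs a variant with `+` in place of `*` to show why requiring at least one leading space is fragile.

```shell
# Hypothetical helper wrapping the sed expression from the answer above.
strip_count() { sed -E 's/^ *[0-9]+ //'; }

# A small count (six leading spaces) and a 7-digit count (no leading spaces).
small=$(printf '%7d %s\n' 3 'ASCII text' | strip_count)
large=$(printf '%7d %s\n' 1000000 'ASCII text' | strip_count)

# Same unpadded line, but with ' +' instead of ' *': nothing matches,
# so the count is left in place.
plus=$(printf '%7d %s\n' 1000000 'ASCII text' | sed -E 's/^ +[0-9]+ //')

printf '[%s]\n[%s]\n[%s]\n' "$small" "$large" "$plus"
```

The first two results should be the bare text, while the third should still carry its count.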
agc