36

I am currently trying to make a script that greps its input to see whether something is of a certain file type (zip, for instance). The text before the file extension could be anything, so for instance

something.zip
this.zip
that.zip

would all fall under the category. I am trying to grep for these using a wildcard, and so far I have tried this

grep ".*.zip"

Whenever I do that, it finds the .zip files just fine, but it also displays output when there are additional characters after the .zip, so for instance .zippppppp or .zipdsjdskjc would still be picked up by grep. What should I do to prevent grep from displaying matches that have additional characters after the .zip?

Chris Seymour
lacrosse1991
  • i find it [better to use ripgrep](https://stackoverflow.com/a/30138655/274502) – cregox Aug 28 '20 at 12:19
  • @cregox, you might be on a system that does not allow you to install rip grep though. – Daniel L. VanDenBosch Apr 27 '21 at 16:40
  • @daniel yes. and many other possible error scenarios. i still find it better, though. – cregox May 10 '21 at 12:08
  • Why don't any of the answers posted here work with the jar command? I'm trying to grep some files within a JAR file using `jar tf name-of-my-file.jar |` plus any of the `grep` answers given here, but it returns nothing when it should return matches. Any idea why? – Metafaniel Dec 30 '22 at 02:08

11 Answers

85

Test for the end of the line with $ and escape the second . with a backslash so it only matches a period and not any character.

grep ".*\.zip$"

However, ls *.zip is a more natural way to list all the .zip files in the current directory, and find . -name "*.zip" lists all .zip files in the current directory and its sub-directories.
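As a quick check of the anchored pattern, here is a minimal sketch that pipes a few made-up names through grep. The sample names and the `matches` variable are assumptions for illustration, not part of the question.

```shell
# Hypothetical sample input: two real .zip names plus some near-misses.
matches=$(printf '%s\n' something.zip this.zip that.zippppp archive.zipdsjdskjc notes.txt |
  grep '\.zip$')

# Only lines that end in a literal ".zip" should remain.
printf '%s\n' "$matches"
```

The names with extra characters after .zip are dropped because `$` only matches at the end of the line.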

Chris Seymour
  • How about `grep "\.zip"` – Steve May 12 '16 at 03:07
  • @Steve the `\.zip$` uses the `$` to denote end of line. This means that even a file with ".zip" in the filename (which would be crazy) would not trigger the filter. The file *must* have a `.zip` extension to be caught by the filter. – Shrout1 Aug 13 '18 at 15:57
  • What is the purpose of the first dot in the grep command? – FlexMcMurphy Jan 06 '21 at 22:02
19

On UNIX, try:

find . -type f -name \*.zip
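To try this safely, the sketch below builds a throwaway directory and runs the same find expression against it. The directory layout, the file names, and the `found` variable are all made up for illustration.

```shell
# Build a throwaway directory tree; every file name here is hypothetical.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/sub"
touch "$tmpdir/a.zip" "$tmpdir/sub/b.zip" "$tmpdir/c.zipped" "$tmpdir/zip"

# Quote (or backslash-escape) the glob so the shell passes it to find unexpanded.
found=$(cd "$tmpdir" && find . -type f -name '*.zip' | LC_ALL=C sort)
printf '%s\n' "$found"

rm -rf "$tmpdir"
```

Quoting the glob as '*.zip' has the same effect as the \*.zip form: in both cases find, not the shell, does the matching.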
Eric Bolinger
Student
7

You can also use grep to find all files with a specific extension:

find .|grep -e "\.gz$"

The . means the current folder. If you want to specify a folder other than the current folder, just replace the . with the path of the folder. Here is an example: Let's find all files that end with .gz and are in the folder /var/log

  find /var/log/ |grep -e "\.gz$"

The output is something similar to the following:

$ find /var/log/ | grep -e "\.gz$"

/var/log//mail.log.1.gz
/var/log//mail.log.0.gz
/var/log//system.log.3.gz
/var/log//system.log.7.gz
/var/log//system.log.6.gz
/var/log//system.log.2.gz
/var/log//system.log.5.gz
/var/log//system.log.1.gz
/var/log//system.log.0.gz
/var/log//system.log.4.gz

The $ sign anchors the match to the end of the line, so only paths that end in .gz are printed.

Stryker
6

I use this to get a listing of the file types inside a folder.

find . -type f | grep -E -o "\.\w*$" | sort -u

Outputs for example:

.DS_Store
.MP3
.aif
.aiff
.asd
.doc
.flac
.jpg
.m4a
.m4p
.m4r
.mp3
.pdf
.png
.txt
.wav
.wma
.zip

BONUS: with

find . -type f | grep -E -o "\.\w*$" | sort | uniq -c

You'll get the file count:

    106 .DS_Store
     35 .MP3
     89 .aif
      5 .aiff
    525 .asd
      1 .doc
     60 .flac
     48 .jpg
    149 .m4a
     11 .m4p
      1 .m4r
  12844 .mp3
      1 .pdf
      5 .png
      9 .txt
    108 .wav
     44 .wma
      2 .zip
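The counting pipeline can be tried on a small scratch directory, as in the sketch below. The file names and the `counts` variable are made up. As assumptions of this sketch rather than the answer's exact command, it spells \w as the POSIX class [[:alnum:]_], uses + instead of * so a bare trailing dot is not counted, and adds awk only to strip the padding that uniq -c puts before each count.

```shell
# Scratch directory with hypothetical files: two .mp3, one .zip, one without extension.
tmpdir=$(mktemp -d)
touch "$tmpdir/a.mp3" "$tmpdir/b.mp3" "$tmpdir/c.zip" "$tmpdir/README"

# grep -o prints only the matched extension; uniq -c counts adjacent duplicates,
# which is why the list is sorted first.
counts=$(cd "$tmpdir" && find . -type f |
  grep -E -o '\.[[:alnum:]_]+$' | LC_ALL=C sort | uniq -c | awk '{print $1, $2}')
printf '%s\n' "$counts"

rm -rf "$tmpdir"
```

Files without an extension, such as README here, produce no output from grep -o and are simply left out of the count.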
index opout
5

You need to do a couple of things. It should look like this:

grep '.*\.zip$'

You need to escape the second dot, so it will just match a dot, and not any character. Using single quotes makes the escaping a bit easier.

You need the dollar sign at the end of the line to indicate that you want the "zip" to occur at the end of the line.

Vaughn Cato
3
grep -r pattern --include="*.txt" /path/to/dir/
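Note that this command searches file contents rather than file names: -r recurses into the directory, and --include limits the search to files whose names match the glob. A minimal sketch, assuming a grep that supports --include (GNU grep does); the file names, the search word "needle", and the `hits` variable are hypothetical.

```shell
# Two hypothetical files with identical contents but different extensions.
tmpdir=$(mktemp -d)
printf 'needle\n' > "$tmpdir/a.txt"
printf 'needle\n' > "$tmpdir/b.log"

# -l lists only the names of matching files; b.log is skipped by --include.
hits=$(cd "$tmpdir" && grep -r -l 'needle' --include='*.txt' .)
printf '%s\n' "$hits"

rm -rf "$tmpdir"
```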
RavinderSingh13
2

Try: grep -o -E "(\\.([A-Za-z])+)+"

I used this to get multi-dotted/multiple extensions. So if the input was hello.tar.gz, then it would output .tar.gz. For single dotted, use grep -o -E "\\.([A-Za-z])+$". Tested on Cygwin/MingW+MSYS.

dsrdakota
2

One more fix/addon of the above example:

# multi-dotted/multiple extensions
grep -oEi "(\\.([a-z0-9])+)+" file.txt

# single dotted
grep -oEi "\\.([a-z0-9])+$" file.txt

This matches file extensions that contain digits, such as '.mp3'.
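The difference between the two patterns is easier to see on concrete input. This is a small sketch with made-up names; it writes the character class as A-Za-z0-9, since in ASCII an A-z range would also cover a few punctuation characters. The `multi` and `single` variables are just for illustration.

```shell
# Multi-part extension: the group repeats, so every ".part" in the run is captured.
multi=$(printf '%s\n' hello.tar.gz | grep -oE '(\.[A-Za-z0-9]+)+')

# Single extension: anchored with $, so only the last ".part" of each line is kept.
single=$(printf '%s\n' hello.tar.gz song.mp3 | grep -oE '\.[A-Za-z0-9]+$')

printf '%s\n' "$multi" "$single"
```

The first command prints .tar.gz, and the second prints .gz and .mp3 on separate lines.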

browseman
2

Just reviewing some of the other answers. The .* isn't necessary, and if you're looking for a certain file extension, it's best to include -i so that the match is case-insensitive, in case the file is HELLO.ZIP, for example. The quotes are necessary, though: without them the shell removes the backslash before grep sees it, and the pattern becomes .zip$, where the dot matches any character.

grep -i '\.zip$'
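A short sketch of both points, using made-up names: the first part shows what -i adds, and the second shows what an unquoted \.zip$ looks like once the shell has processed it. The `matches` and `unquoted` variables are hypothetical.

```shell
# Case-insensitive match on the extension; the sample names are made up.
matches=$(printf '%s\n' HELLO.ZIP hello.zip hellozip hello.zipx | grep -i '\.zip$')
printf '%s\n' "$matches"

# What a command receives when the pattern is left unquoted:
# the shell removes the backslash, leaving ".zip$".
set -- \.zip$
unquoted=$1
printf '%s\n' "$unquoted"
```

With the pattern quoted, hellozip is rejected. The unquoted form would reach grep as .zip$, so the dot would match any character and hellozip would pass.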
twasbrillig
  • This is the best answer in my opinion, since it uses the least amount of characters to get the desirable outcome, and it's case-insensitive, which is important for a wildcard type of functionality. – Bartekus Feb 17 '19 at 23:21
2

If you just want to search the current folder, why not use this simple command, without grep?

ls *.zip 
Ayyappa
0

Simply do :

grep ".*.zip$"

The "$" indicates the end of line

AaronO
  • Note, this would include files such as `hello.unzip` or `hi.xzip`, or even `hellozip`. You should escape the second "." – twasbrillig Apr 23 '15 at 18:25
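The effect described in this comment can be checked by counting matches for both patterns over the same names. A minimal sketch; the `names`, `loose`, and `strict` variables are made up for illustration.

```shell
# The names mentioned in the comment, plus one genuine .zip file.
names=$(printf '%s\n' hello.zip hello.unzip hi.xzip hellozip)

# Unescaped dot: "." matches any character, so all four names pass.
loose=$(printf '%s\n' "$names" | grep -c '.*.zip$')

# Escaped dot: only names ending in a literal ".zip" pass.
strict=$(printf '%s\n' "$names" | grep -c '.*\.zip$')

printf 'loose=%s strict=%s\n' "$loose" "$strict"
```

This prints loose=4 strict=1, which is why the second dot should be escaped.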