1

I would like to gzip all files in a directory if the file name matches a regular expression. Is there a way I could do something similar to:

gzip \b[^2\W]{2,}\b

Right now, when I do that it gives me an error because it does not know that I want to match a regex.

tripleee
Nathan Dai

2 Answers

1
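find . -maxdepth 1 -type f -regextype posix-extended -regex '.*\b[_[:alpha:]013-9]{2,}\b.*' -exec gzip {} +

The -maxdepth 1 prevents find from traversing subdirectories, which is otherwise its default behavior and primary purpose, and -type f skips directories.

The -regex argument needs to match the whole path (such as ./name), so I added .* on both sides. The -regextype posix-extended option is specific to GNU find; the default regex syntax treats {2,} literally, and the option has to come before -regex.

A bracket expression cannot contain \W or a nested negated class, so [^2\W] (a word character other than 2) is written out as [_[:alpha:]013-9].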
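Before compressing anything, it may be worth doing a dry run that only prints what would be selected. The following is a minimal sketch, assuming GNU find (for -regextype) and spelling [^2\W] as [_[:alpha:]013-9]; the scratch directory and the file names ab, notes, 22 and a are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Dry run: list the files the regex selects instead of compressing them.
# Assumes GNU find; the file names below are hypothetical test data.
scratch=$(mktemp -d)
touch "$scratch/ab" "$scratch/notes" "$scratch/22" "$scratch/a"

# -print replaces -exec gzip {} + so nothing is modified.
matched=$(cd "$scratch" && find . -maxdepth 1 -type f \
    -regextype posix-extended \
    -regex '.*\b[_[:alpha:]013-9]{2,}\b.*' -print | sort)

printf '%s\n' "$matched"
rm -rf "$scratch"
```

Once the list looks right, swapping -print back to -exec gzip {} + performs the actual compression.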

  • This is a good idea, but you messed up the syntax in a couple of places. The argument to `-exec` should not be `$1` (and anyway, basically [always quote file name variables](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable)) but `{} +`, and `find` will traverse subdirectories, so you will want to add a `-maxdepth 1` to prevent that. – tripleee Oct 04 '21 at 04:49
  • We are generally advised to avoid making code changes without leaving a comment. I'll be happy to change those things if you are unable to, with your permission. – tripleee Oct 04 '21 at 05:02
  • I see that this post was just edited. In both commands, I received this error: `FIND: Parameter format not correct` – Nathan Dai Oct 04 '21 at 05:44
  • @NathanDai What OS are you running? –  Oct 04 '21 at 05:55
  • Windows with Cygwin. That probably explains the error. – Nathan Dai Oct 04 '21 at 06:06
  • Yup. Had the same problem myself in school. I only run Linux at home... Unfortunately a reinstall is required to add packages to Cygwin. See [https://superuser.com/questions/304541/how-to-install-new-packages-on-cygwin] –  Oct 04 '21 at 06:13
1

It's not clear which shell you are asking about. Bash has extended glob patterns, but those are still not regular expressions. For a proper regex, you will want to loop over the files and test each name:

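pat='\b[_[:alpha:]013-9]{2,}\b'
for file in ./*; do
    if [[ "$file" =~ $pat ]]; then
        gzip "$file"
    fi
done

Bash uses POSIX extended regular expressions, which do not portably support the Perl-compatible \W, and in any case not inside a bracket expression. [^2\W] means "a word character other than 2", so I spelled it out with POSIX character classes as [_[:alpha:]013-9]. The \b word boundary is also an extension, so this depends on the regex support in your C library. Perhaps see also Bash Regular Expression -- Can't seem to match any of \s \S \d \D \w \W etc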
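To check the pattern without touching any files, the test can be pulled out into a small function and tried on a few sample names. This is only a sketch: would_gzip is a hypothetical helper name, the sample names are invented, the class [_[:alpha:]013-9] is my spelling of [^2\W], and \b is assumed to work in your system's regex library (it does with glibc, as far as I know).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the name matches the pattern.
# [_[:alpha:]013-9] stands in for Perl's [^2\W]; \b is a GNU extension.
pat='\b[_[:alpha:]013-9]{2,}\b'

would_gzip() {
    [[ "$1" =~ $pat ]]
}

# Report the verdict for a few invented names; nothing is compressed.
for name in ./ab ./22 ./a ./x2y; do
    if would_gzip "$name"; then
        echo "match:    $name"
    else
        echo "no match: $name"
    fi
done
```

Keeping the pattern in a variable and leaving $pat unquoted on the right-hand side of =~ matters; a quoted right-hand side is matched as a literal string.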

In the general case, you can always use a separate regular expression tool.

printf '%s\0' ./* |
perl -0lne 'print if /\b[^2\W]{2,}\b/' |
xargs -0 gzip

The shenanigans with \0 bytes are there to support arbitrary file names robustly (think file names with newlines in them, etc.); see also http://mywiki.wooledge.org/BashFAQ/020

tripleee