
I am using Bash to walk through some directories and files. I am using the find command as shown in the example below, and I need to exclude all directories (along with everything they contain) and files whose names match some regular expression (which can be anything).

while IFS= read -r file
do
    do_stuff "${file}"
done <<< "$(find "${SOME_DIR}")"

I already tried the -not -name arguments, but that excluded only the matching directory itself; everything inside that directory was still processed.

By the way, I found some similar questions here, but they are all quite specific. This regular expression can be anything.

example: REGEX='^bbb$'

dir structure:

./test
├── bbb
│   └── some file
├── -e
├── dir2
│   └── bbb
├── a
├── b
└── c

So here I would like find to exclude the bbb directory, everything in it, and the bbb file.

Another regex example: '^.b.$' -- all files and directories with a three-character name whose middle character is b

Zeusko
  • please add examples, show sample directory structure, whether they are nested or not, add sample filenames and explain which files are needed and which are not – Sundeep Mar 17 '17 at 14:44
  • @Sundeep added example – Zeusko Mar 17 '17 at 15:02
  • I didn't mean you need to draw structure itself... anyway, so in shown example, you want to exclude any file/dir named `bbb`.. that is clear... what about `some file` under `bbb` directory, that should also be excluded? – Sundeep Mar 17 '17 at 15:05
  • Yes, all that the directory contains, so also `some file` – Zeusko Mar 17 '17 at 15:08

1 Answer

You have several options. Closest to what you asked, if you are using GNU find then you can use a negated -regex test to filter out the files you don't want to see. Since this matches against the whole path to each file (relative to one of the starting directories) you can write a regex that matches both paths ending with your file name and those having that name as an intermediate directory. For example,

find . -not -regex '\(.*/\)?bbb\(/.*\)?'

(Note that anchoring the pattern is unnecessary, as the test succeeds only if the pattern matches the whole path under consideration anyway.)
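As a rough illustration of turning a name-only regex into one that matches the whole path, the sketch below wraps the pattern the same way as the command above. It assumes GNU find, uses -regextype posix-extended so the groups need no backslashes, and builds a throwaway, reduced copy of the question's tree; the variable names (demo, name_re, regex_result) are made up for this example.

```shell
# Hypothetical demo tree, a reduced copy of the example in the question.
demo=$(mktemp -d)
mkdir -p "$demo/test/bbb" "$demo/test/dir2"
touch "$demo/test/bbb/some file" "$demo/test/dir2/bbb" \
      "$demo/test/a" "$demo/test/b" "$demo/test/c"

# name_re: regex for a single path component, written without ^ and $.
# Assumption: it cannot match a '/' (use [^/] instead of . if that is a risk).
name_re='bbb'

# Wrap the name regex so that it matches the whole path, whether the name is
# the last component or an intermediate directory, then negate the test.
regex_result=$(cd "$demo/test" &&
    find . -regextype posix-extended \
         -not -regex "(.*/)?(${name_re})(/.*)?" | LC_ALL=C sort)

printf '%s\n' "$regex_result"
rm -rf "$demo"
```

Note that find still descends into bbb here; the matching paths are only filtered out of the output, which is one reason the -prune approach can be preferable on large trees.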

But better might be to use a negated filename test combined with the -prune action, something like this:

find . -not -name 'bbb' -o \( -prune -false \)

The -name test compares the base file name of each file considered with the specified shell pattern (a glob, not a regular expression), and the result is negated by the -not operator. The right-hand expression of the -o (logical or) operator is evaluated only if the left-hand expression evaluates to false, and in that case it performs the -prune action before ultimately evaluating to false itself. Thus, files with the given name are suppressed (by -false) and their descendants, if any, are not scanned at all (because of -prune). The -name test and the -prune action are specified by POSIX, but note that -not and -false are extensions (supported by GNU find, among others); the POSIX spelling of -not is !.
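The same pruning idea can also be sketched using only primaries and operators that POSIX specifies, by pruning on a positive -name match and printing everything else explicitly. This is a minimal sketch on a throwaway, reduced copy of the question's tree; the variable names (demo, prune_result) are made up, and the printf in the loop is only a stand-in for the asker's do_stuff.

```shell
# Hypothetical demo tree, a reduced copy of the example in the question.
demo=$(mktemp -d)
mkdir -p "$demo/test/bbb" "$demo/test/dir2"
touch "$demo/test/bbb/some file" "$demo/test/dir2/bbb" \
      "$demo/test/a" "$demo/test/b" "$demo/test/c"

# Anything named bbb is pruned (not printed, not descended into);
# every other path falls through to the explicit -print.
prune_result=$(cd "$demo/test" &&
    find . -name 'bbb' -prune -o -print | LC_ALL=C sort)

# Feed the surviving paths to a loop, one per line.
# Assumption: no file name contains a newline.
printf '%s\n' "$prune_result" | while IFS= read -r file
do
    printf 'would process: %s\n' "$file"   # stand-in for do_stuff
done

rm -rf "$demo"
```

Because -name takes a glob, the question's second example would be written as -name '?b?' rather than as a regex.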

John Bollinger
  • nice one, to clarify some points for OP, can you mention that -name specified is [glob](http://mywiki.wooledge.org/glob) and not regex? and possibly an example with `-regex` option which has syntax differences and requirement to match path, etc – Sundeep Mar 17 '17 at 15:15
  • so if I replace `-name` with `-regex`, it should work? – Zeusko Mar 17 '17 at 15:20
  • For the first part of the answer: so I should somehow work with the regex to make it match the whole path? And if so, is there an easy way? – Zeusko Mar 17 '17 at 15:24
  • @Zeusko see https://stackoverflow.com/questions/6844785/how-to-use-regex-with-find-command for examples with -regex – Sundeep Mar 17 '17 at 15:32
  • @Sundeep, thanks for your suggestions. I have implemented them. – John Bollinger Mar 17 '17 at 15:32
  • @Zeusko, you can use `-regex` to solve the problem (supposing you can rely on having GNU's version of `find`), but the semantics are not exactly analogous to those of `-name`. I have updated the answer with an example. Nevertheless, unless you need to support regexes that cannot easily be replaced by shell globs, I recommend instead the approach based on `-name` that I outlined. It's more portable and easier to read. – John Bollinger Mar 17 '17 at 15:35
  • OK, thank you for the advice. `-regex` will probably work for me. – Zeusko Mar 17 '17 at 15:41