
I need to match files only with one specific extension under all nested directories, including the PWD, with BASH using "globbing".

From this Answer (here), I believe there may not be a way to do this using globbing.

tl;dr

I need:

  • A glob expression
  • To work with any command that accepts simple globs (ls, sed, cp, cat, chown, rm, et cetera)
  • Mainly in BASH, but other shells would be interesting
  • Both in the PWD and all subdirectories recursively
  • For files with a specific extension

I'm using grep & ls only as examples, but I need a glob expression that applies to other commands also.

  • grep -r --include=GLOB is not a glob expression for, say, cp; it is a workaround specific to grep and is not a solution.
  • find is not a glob, but it may be a workaround for non-grep commands if there is no such glob expression. It would need | or while do;, et cetera.

Examples

Suppose I have these files, all containing "find me":

./file1.js
./file2.php
./inc/file3.js
./inc/file4.php
./inc.php/file5.js
./inc.php/file6.php

I need to match only/all .php one time:

./file2.php
./inc/file4.php
./inc.php/file6.php

Duplicates returned: shopt -s globstar; ... **/*.php

This changes the problem; it does not solve it.

Dup: ls

Before entering shopt -s globstar as a single command...

ls **/*.php returns:

inc/file4.php
inc.php/file5.js
inc.php/file6.php
  • file2.php does not return.

After entering shopt -s globstar as a single command...

ls **/*.php returns:

file2.php
inc/file4.php
inc.php/file6.php

inc.php:
file5.js
file6.php
  • inc.php/file6.php returns twice.

Dup: grep

Before entering shopt -s globstar as a single command...

grep -R "find me" **/*.php returns:

inc/file4.php: find me
inc.php/file6.php: find me
  • file2.php does not return.

After entering shopt -s globstar as a single command...

grep -R "find me" **/*.php returns:

file2.php: find me
inc/file4.php: find me
inc.php/file5.js: find me
inc.php/file6.php: find me
inc.php/file6.php: find me
  • inc.php/file6.php returns twice.
    • After seeing the duplicate in the ls output, we know why.

Current solution: faulty misuse of && logic

grep -r "find me" *.php && grep -r "find me" */*.php
ls -l *.php && ls -l */*.php
  • Please no! I fail here && so I never happen
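
To make the && pitfall above concrete: A && B runs B only when A exits with status 0, so if nothing matches at the top level, the second command never runs. A minimal sketch, assuming bash and a throwaway tree (the demo variable, the temporary directory, and the single sample file are illustrative, not part of the original setup):

```shell
#!/usr/bin/env bash
# Throwaway tree with a .php file only in a subdirectory (names are illustrative).
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir inc
echo "find me" > inc/file4.php

# A && B: no top-level *.php exists, so the first ls fails and the second never runs.
# The trailing "|| true" only keeps the script going if it is run with "set -e".
chained=$(ls *.php 2>/dev/null && ls */*.php) || true

# Using ; instead runs both commands regardless of the first exit status.
sequenced=$(ls *.php 2>/dev/null; ls */*.php)

# One command with both patterns; nullglob drops the pattern that matches nothing.
shopt -s nullglob
combined=$(ls *.php */*.php)

printf 'chained:   [%s]\n' "$chained"
printf 'sequenced: [%s]\n' "$sequenced"
printf 'combined:  [%s]\n' "$combined"
```

Note that */*.php still covers only one directory level, and that with nullglob a command whose patterns all match nothing is run with no file arguments at all.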

Desired solution: single command via globbing

grep -r "find me" [GLOB]
ls -l [GLOB]

Insight from grep

grep does have the --include flag, which achieves the same result but using a flag specific to grep. ls does not have an --include option. This leads me to believe that there is no such glob expression, which is why grep has this flag.

Jesse
  • `-r` of grep will by default include the PWD. It looks like adding `--include` makes it somehow not match files in the current dir. I wouldn't expect that and it could be a bug … – knittl May 15 '22 at 08:28
  • @knittl `-r` includes *.php files in the PWD, but only subdirs that also match *.php, I want *.php files in both the PWD and all subdirs regardless of containing .php in the name. I still can't find a way. – Jesse May 15 '22 at 10:05
  • This is about specific file extensions, not *all files* as in the proposed [dup](https://stackoverflow.com/questions/4349082/match-all-files-under-all-nested-directories-with-shell-globbing). That solution applied here only changes the problem; it does not solve it. It could work for `... **/*`, such as addressed by "all files" in the proposed dup Question, but not with a specified file extension like *.php in my Question. – Jesse May 15 '22 at 10:50
  • I don't understand why the solution with `find -exec` doesn't help you – Fravadona May 15 '22 at 16:12
  • I don't understand why people think this is a programming problem, and it keeps getting closed as a duplicate. That said: on my installation (Ubuntu 20.04, grep (GNU grep) 3.4) `grep -R --include="*.php" "Starting" .` does exactly what you're asking, I can't imagine why it doesn't for you. – tink May 15 '22 at 16:38
  • [Please don't](https://meta.stackexchange.com/q/7046/248627) repeat [questions](https://stackoverflow.com/q/72245606/354577). You have edited your original question. Great! That puts it in a queue to be considered for reopening. It has already earned two votes to reopen. A third will do it. Posting the same question again is counterproductive. – ChrisGPT was on strike May 15 '22 at 17:18
  • Reopen, pls. How can I explain that globs, useful for multiple commands, with file extension, both PWD and recursive... are all criteria that haven't turned up in any of the proposed dups? – Jesse May 15 '22 at 17:46
  • @Chris, from the "close" instructions on [that post](https://stackoverflow.com/questions/72245606): *Your post has been associated with a similar question. If this question doesn’t resolve your question, ask a new one.* – Jesse May 15 '22 at 17:58
  • @sideshowbarker That is MY own Question about using `grep`. This is a Question about **globs**, which clearly says that **globs** don't seem able, so `grep` uses a **flag** instead of a **glob**. – Jesse May 16 '22 at 07:17

3 Answers


With bash, you can first do a shopt -s globstar to enable recursive matching, and then the pattern **/*.php will expand to all the files in the current directory tree that have a .php extension.

zsh and ksh93 also support this syntax (ksh93 after set -o globstar). Other commands that take a glob pattern as an argument and expand it themselves (like your grep --include) likely won't.
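
As a minimal sketch of the above, assuming bash 4.0 or later and the sample tree from the question recreated in a temporary directory (the demo and matches names are illustrative), the expansion can be captured in an array and printed with printf, which shows the names without descending into matched directories the way ls does:

```shell
#!/usr/bin/env bash
# Recreate the question's sample tree in a throwaway directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir inc inc.php
for f in file1.js file2.php inc/file3.js inc/file4.php inc.php/file5.js inc.php/file6.php; do
  echo "find me" > "$f"
done

shopt -s globstar          # enable recursive ** matching
matches=( **/*.php )       # the pattern is expanded once, into an array

# One name per line; the directory inc.php is itself one of the matches.
printf '%s\n' "${matches[@]}"
echo "count: ${#matches[@]}"
```

The array holds four entries here: the three .php files plus the directory inc.php, which is why ls appears to list inc.php/file6.php twice.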

Shawn
  • For bash, `globstar` was introduced in bash 4.0 – Fravadona May 15 '22 at 08:51
  • @Fravadona `**/*.php` works with `ls **/*.php` without `shopt -s globstar` first, but with `shopt -s globstar`, a .php file in a .php directory gets listed twice. This is helpful, and may be the best answer, but it doesn't solve my problem because 1. it's not a single command, it's a setting, then 2. it returns duplicate items. But, thanks, upvoting. – Jesse May 15 '22 at 10:14
  • @JesseSteele if you have a directory ending in .php the glob will expand it and if it's an argument to ls, ls will list the contents of the directory... – Shawn May 15 '22 at 11:40
  • If you just want to display names, use `printf "%s\n" **/*.php` instead of `ls **/*.php` (or use `find`). `ls` is so often misused; see also https://mywiki.wooledge.org/ParsingLs – Shawn May 15 '22 at 11:41

Suggesting a different strategy:

Use an explicit find command with the -printf option (a GNU find extension) to build bash command(s) for the selected files.

Inspect the generated commands for correctness, then run them.

1. Prepare bash commands for the selected files

find . -type f -name "*.php" -printf "cp %p ~/destination/ \n"

2. Inspect the output; correct the command and the filter, then test

cp ./file2.php ~/destination/
cp ./inc/file4.php ~/destination/
cp ./inc.php/file6.php ~/destination/

3. Execute the prepared find output

bash <<< "$(find . -type f -name "*.php" -printf "cp %p ~/destination/ \n")"
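
The generated-command approach above breaks on file names containing spaces or shell metacharacters, because the names are pasted into command text unquoted. A hedged alternative sketch, assuming bash and a find that supports -print0 (GNU and BSD find do), with a hypothetical dest directory standing in for ~/destination/:

```shell
#!/usr/bin/env bash
# Throwaway tree; one file name contains a space on purpose.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir inc inc.php dest
echo "find me" > file2.php
echo "find me" > "inc/file 4.php"
echo "find me" > inc.php/file6.php

copied=0
# -print0 and read -d '' keep each name intact, whatever characters it contains.
# dest is pruned only because it lives inside the searched tree in this demo.
while IFS= read -r -d '' f; do
  cp -- "$f" dest/
  copied=$((copied + 1))
done < <(find . -path ./dest -prune -o -type f -name '*.php' -print0)

echo "copied: $copied"
ls dest
```

This gives up the inspect-before-running step; replacing cp with echo cp in the loop is one way to get a dry run back.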
Dudi Boy
  • So, in other words, you're recommending `find` as a way to work around this problem because it's possible that glob expressions can't do this. It doesn't solve my problem, but I'm upvoting because it does help explain what is going on and how files work. – Jesse May 15 '22 at 10:16

With shell globbing it is possible to get only directories by adding a / at the end of the glob, but there's no way to get only files (zsh being an exception).

Illustration:

With the given tree:

file.php
inc.php/include.php
lib/lib.php

Supposing that the shell supports the non-standard ** glob:

  • **/*.php/ expands to inc.php/

  • **/*.php expands to file.php inc.php inc.php/include.php lib/lib.php

  • For getting file.php inc.php/include.php lib/lib.php, you cannot use a glob.
    => with zsh it would be **/*.php(.)
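
In bash, the closest thing to zsh's (.) qualifier is to filter the expansion after the fact. A minimal sketch, assuming bash 4.0 or later with globstar and illustrative names (demo, files, p), keeps only regular files in an array that can then be handed to any command:

```shell
#!/usr/bin/env bash
# Recreate the tree from the illustration in a throwaway directory.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir inc.php lib
echo "find me" > file.php
echo "find me" > inc.php/include.php
echo "find me" > lib/lib.php

shopt -s globstar nullglob
files=()
for p in **/*.php; do
  # -f is true for regular files (and symlinks to them), false for directories.
  if [[ -f $p ]]; then
    files+=( "$p" )
  fi
done

# The array now stands in for the glob in any command.
grep -H "find me" "${files[@]}"
echo "count: ${#files[@]}"
```

This is a loop plus an array rather than a single glob expression, so it supports the point above: the filtering has to happen outside the pattern.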

Standard work-around (any shell, any OS)

The POSIX way to recursively get the files that match a given standard glob and then apply a command to them is to use find -type f -name ... -exec ...:

  • ls -l <all .php files> would be:
find . -type f -name '*.php' -exec ls -l {} +
  • grep "find me" <all .php files> would be:
find . -type f -name '*.php' -exec grep "find me" {} +
  • cp <all .php files> ~/destination/ would be:
find . -type f -name '*.php' -exec sh -c 'cp "$@" ~/destination/' _ {} +

Remark: this one is a little trickier because ~/destination/ must come after the file arguments, and find's syntax doesn't allow find -exec ... {} ~/destination/ +
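
The same sh -c wrapper also works as a template when each file needs its own command invocation. A minimal sketch, assuming a hypothetical dest directory inside a throwaway tree and an illustrative .bak suffix, loops over the batch of names that find passes in:

```shell
#!/usr/bin/env bash
# Throwaway tree based on the question's .php files.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir inc inc.php dest
echo "find me" > file2.php
echo "find me" > inc/file4.php
echo "find me" > inc.php/file6.php

# find hands a batch of names to one sh process; "_" fills $0 so the names
# land in "$@", and "for f" with no list iterates over "$@".
# dest is pruned only because it lives inside the searched tree in this demo.
find . -path ./dest -prune -o -type f -name '*.php' -exec sh -c '
  for f; do
    cp -- "$f" dest/"$(basename "$f")".bak
  done
' _ {} +

ls dest
```

Each file is copied under a new name here, which a single batched cp could not do; the loop body can be swapped for any per-file command.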

Fravadona
  • This helps with globbing knowledge in general, which I make room for, so I'm upvoting. But my Question wants globs that can go into any command directly. This is essentially a workaround that would need a `|`. It is useful within scope, but it doesn't address the full Question, so I won't mark it "correct". Just explaining. But, thanks for this, really. – Jesse May 15 '22 at 17:31
  • See my update for a direct answer to your question. btw, why would you need a `|` in the examples with `find`? – Fravadona May 15 '22 at 19:33
  • ...a `|` or `-exec` or `while do;`... a workaround is a workaround. If it's not possible and this is the way, then that's right. – Jesse May 16 '22 at 01:54
  • You can remove your "Update" line ;-) – Jesse May 16 '22 at 01:55
  • The last example could be simplified to `-exec cp -t ~/destination/ {} +` for efficiency if you have GNU `cp`, but it is useful as a template for how to solve the broader problem of processing one file at a time with `-exec ... +` – tripleee May 16 '22 at 04:56
  • @tripleee Nice, I didn't know about `cp -t`, but do you say _"one file at a time"_? `-exec 'cp "$@" dir/' {} +` will copy a bunch of them at a time – Fravadona May 16 '22 at 05:48
  • Yes, what I mean is that the `sh -c 'for ... done' _ {} ` loop demonstrates how to process one file at a time with a single subprocess. – tripleee May 16 '22 at 05:59