In bash, I think I know how to iterate over an array of strings containing spaces:

~ $ arr=( "/home/user/Images/three parts dirname/ccc.png" "/home/user/Images/three parts dirname/bbb.png" "/home/user/Images/three parts dirname/aaa.png" )
~ $ for i in "${arr[@]}"; do echo "$i"; done

/home/user/Images/three parts dirname/ccc.png
/home/user/Images/three parts dirname/bbb.png
/home/user/Images/three parts dirname/aaa.png

I want to do something similar in a script that uses find. So I'm doing...

~ $ d="/home/user/Images/three parts name"
~ $ arr=( $(find -L "$d" -maxdepth 1 -iname '*.jp*g' -o -iname '*.png' -printf '"%p" ') )

...because the command find -L "$d" -maxdepth 1 -iname '*.jp*g' -o -iname '*.png' -printf '"%p" ' gives exactly:

"/home/user/Images/three part name/ccc.png" "/home/user/Images/three part name/bbb.png" "/home/user/Images/three part name/aaa.png" 

The problem is that in this case I have the result below:

~ $ for i in "${arr[@]}"; do echo "$i"; done

"/home/user/Images/three
parts
name/ccc.png"
"/home/user/Images/three
parts
name/bbb.png"
"/home/user/Images/three
parts
name/aaa.png"

So I cannot iterate over those files successfully. I know I can avoid spaces in directory and file names (and I barely have any), but I want the script to work anyway, just in case.

What I mean is: why, when I use find to define the array, is the first item

"/home/user/Images/three

and not

/home/user/Images/three parts name/ccc.png ?

It seems that with find the spaces in the dir names are "hard coded" and/or the double quotes are part of the strings (separated by spaces) themselves.

EDIT here for clarity:
I accepted the answer using find. It has to be corrected by adding escaped parentheses, though, according to stackoverflow.com/a/6957310/1865860. I also wanted to further process the find results, so I had to rely on stackoverflow.com/a/11789688/1865860.

The final output looks like:

readarray -d $'\0' TOT_IMAGES < <(find -L "$i" -maxdepth 1 \( -iname '*.jp*g' -o -iname '*.png' -o -iname '*.gif' -o -iname '*.tif*' \) -print0);
IFS=$'\n';
SORTED_IMAGES=($(sort <<<"${TOT_IMAGES[*]}")); 
TOP4_IMAGES=($(head -n4 <<<"${SORTED_IMAGES[*]}"));
unset IFS
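
As a possible variation on the snippet above, the sorting and "first four" steps can also be done without changing IFS. This is only a sketch under a few assumptions: a `sort` that supports `-z` (GNU coreutils does), Bash 4.4 or later for `readarray -d`, and made-up file names in a scratch directory. The variable names mirror the ones used above.

```bash
#!/usr/bin/env bash
# Scratch data: a directory with spaces in its name (hypothetical file names).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
i="$tmp/three parts name"
mkdir -p "$i"
touch "$i"/{e,c,a,d,b}.png "$i/readme.txt"

# find emits NUL-terminated paths, sort -z orders them, and readarray -d ''
# splits on NUL only, so spaces (and even newlines) in names are preserved.
readarray -d '' SORTED_IMAGES < <(
  find -L "$i" -maxdepth 1 \( -iname '*.jp*g' -o -iname '*.png' \) -print0 | sort -z
)

# Array slicing takes the first four elements without re-splitting anything.
TOP4_IMAGES=("${SORTED_IMAGES[@]:0:4}")

printf '%s\n' "${TOP4_IMAGES[@]}"
```

Slicing with `"${SORTED_IMAGES[@]:0:4}"` keeps each path as one element, whereas piping through `head -n4` relies on paths never containing a newline.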
dentex
  • See why you [DRLWF](https://mywiki.wooledge.org/DontReadLinesWithFor) – Jetchisel Dec 26 '22 at 20:29
  • The double quotes printed with `find -printf` are literal, not syntactical. See [How can I store the "find" command results as an array in Bash](https://stackoverflow.com/q/23356779/3266847) for a robust way to store the result from `find` in an array. – Benjamin W. Dec 26 '22 at 20:37
  • @M.NejatAydin : This is nearly correct, but the OP wants to use `-iname`. With your approach you would therefore have to consider the upper/lower case variants, which makes the pattern slightly more complicated. In addition, you would have to turn on `nullglob` in case one of your wildcard patterns does not produce any match. – user1934428 Dec 27 '22 at 12:54
  • @M.NejatAydin : Good point!! I forgot about this handy option, since I rarely run into this problem. Perhaps you want to write the whole command as an answer, then? – user1934428 Dec 27 '22 at 13:11
  • When you do your `arr=(.....)`, word-split occurs and the output of `find` is broken up by the spaces. Is there a particular reason why you want to use `find`? – user1934428 Dec 27 '22 at 15:17
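
The two points made in the comments above (the quotes are literal characters, and word splitting breaks the output at every space) can be reproduced without `find`. A minimal sketch, where `printf` stands in for the `find ... -printf '"%p" '` output and the path is made up:

```bash
#!/usr/bin/env bash
# Stand-in for what find ... -printf '"%p" ' would print (hypothetical path).
out='"/tmp/three parts name/a.png" '

# Unquoted command substitution: the result is split on whitespace (IFS).
# The double quotes inside it are ordinary data, not shell syntax, so they
# do not protect the spaces and they stay attached to the words.
arr=( $(printf '%s' "$out") )

echo "${#arr[@]}"              # number of elements after splitting
printf '<%s>\n' "${arr[@]}"    # each element on its own line, in brackets
```

Quotes only group words when the shell parses them as part of the command line itself; text produced by an expansion is never re-parsed for quoting.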

2 Answers

2

You could do this in pure bash, without using find:

shopt -s nullglob nocaseglob
arr=("$d"/{*.png,*.jp*g})
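
To see how the two options interact with the pattern, here is a small self-contained sketch. The scratch directory and file names are invented for illustration; only the `shopt` line and the array assignment come from the answer.

```bash
#!/usr/bin/env bash
# Scratch directory with spaces in its name (hypothetical test data).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
d="$tmp/three parts name"
mkdir -p "$d"
touch "$d/aaa.png" "$d/bbb.JPG" "$d/ccc.jpeg" "$d/notes.txt"

# nullglob: a pattern with no match expands to nothing instead of itself.
# nocaseglob: patterns match regardless of case, similar to find's -iname.
shopt -s nullglob nocaseglob
arr=("$d"/{*.png,*.jp*g})

# Each element is one complete path, spaces included.
for i in "${arr[@]}"; do echo "$i"; done
```

Because `"$d"` is quoted and the glob results are never word-split, no extra quoting tricks are needed for names containing spaces.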
M. Nejat Aydin
  • To be picky, in accordance with the OP's code, the pattern should be `*.jp*g`, not `*.jpg`. Most likely, the OP also wants to match files ending in `.jpeg`, `.jpxxg` and so on. – user1934428 Dec 27 '22 at 14:06
  • This should obtain the wanted final result, but the question asks specifically about using `find`, and their particular attempts to do so. – John Bollinger Dec 27 '22 at 14:31
  • @JohnBollinger : Sure, but there is already a `find` solution, and the alternative given by M.NejatAydin is IMO cleaner. I consider both answers useful. The OP can decide, which version he wants to accept. – user1934428 Dec 27 '22 at 15:00
  • Yes, @user1934428, the code presented here is cleaner than using `find` for those cases it handles. But **that does not make this answer responsive to the question that was actually posed**. – John Bollinger Dec 27 '22 at 15:05
  • This is neat (...upvoted), but I accepted the answer with `find`, just because it's what I asked and also it may be expandable if I want to adapt the script to look in nested directories. – dentex Dec 27 '22 at 15:55
  • @dentex This is expandable too if you want to recurse into directories: just add the `globstar` option and `arr=("$d"/**/{*.png,*.jpg})` will do it. When you have a recent version of `bash`, you rarely need `find` if all you want is to search based on filenames. – M. Nejat Aydin Dec 27 '22 at 16:12
1

You can use find with the -print0 option, which separates results with the ASCII NUL character (instead of a newline), and then read them with readarray:

readarray -d $'\0' arr < <(find -L "$d" -maxdepth 1 -iname '*.jp*g' -o -iname '*.png' -print0)

so you don't have to worry about the spaces.
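
One detail worth checking when combining `-o` with an action: in `find`, the implicit "and" binds tighter than `-o`, so without grouping `-print0` is attached only to the last `-iname` test. The sketch below compares the ungrouped and grouped forms on invented scratch files. It assumes Bash 4.4 or later for `readarray -d`, and it passes `-d ''`, which as far as I can tell is what `$'\0'` amounts to in Bash.

```bash
#!/usr/bin/env bash
# Scratch directory with spaces in its name (hypothetical test data).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
d="$tmp/three parts name"
mkdir -p "$d"
touch "$d/aaa.png" "$d/bbb.jpg"

# Ungrouped: parsed as (-iname '*.jp*g') -o (-iname '*.png' -print0),
# so only the .png match reaches -print0.
readarray -d '' ungrouped < <(
  find -L "$d" -maxdepth 1 -iname '*.jp*g' -o -iname '*.png' -print0
)

# Grouped: the escaped parentheses make -print0 apply to both tests.
readarray -d '' grouped < <(
  find -L "$d" -maxdepth 1 \( -iname '*.jp*g' -o -iname '*.png' \) -print0
)

echo "ungrouped: ${#ungrouped[@]}, grouped: ${#grouped[@]}"
```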

Diego Torres Milano
  • In which versions of Bash does this work? [The manual](https://www.gnu.org/software/bash/manual/bash.html#ANSI_002dC-Quoting) does not document `\0` as one of the escape sequences supported in a `$''` string, and embedding null characters in the values of shell variables is generally pretty questionable. Also, older Bash (version 4.3, say) does not recognize a `-d` option for `readarray`. – John Bollinger Dec 27 '22 at 14:25
  • However, if one is willing to assume -- not necessarily safely -- that paths will not contain newlines, then one can use more or less this approach with find's `-print` output instead of `-print0`. – John Bollinger Dec 27 '22 at 14:27
  • Answer accepted because it uses `find`. It has to be corrected adding escaped brackets though, according to https://stackoverflow.com/a/6957310/1865860 I also wanted to further process the `find` results, so I had to rely on https://stackoverflow.com/a/11789688/1865860 The final output looks like: `readarray -d $'\0' TOT_IMAGES < <(find -L "$i" -maxdepth 1 \( -iname '*.jp*g' -o -iname '*.png' -o -iname '*.gif' -o -iname '*.tif*' \) -print0); IFS=$'\n'; SORTED_IMAGES=($(sort <<<"${TOT_IMAGES[*]}")); TOP4_IMAGES=($(head -n4 <<<"${SORTED_IMAGES[*]}")); unset IFS` – dentex Dec 27 '22 at 16:00
  • BTW, `printf '%c' $'\0' | xxd` shows the NUL char, so the sequence works – Diego Torres Milano Dec 27 '22 at 19:08
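
For Bash versions whose `readarray` has no `-d` option, as raised in the comments above, a commonly used alternative is a `read -d ''` loop over the same `-print0` output. This is a sketch on invented scratch files, not a tested statement about any particular Bash release:

```bash
#!/usr/bin/env bash
# Scratch directory with spaces in its name (hypothetical test data).
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
d="$tmp/three parts name"
mkdir -p "$d"
touch "$d/aaa.png" "$d/bbb.jpg"

# read -d '' stops at each NUL byte; IFS= and -r keep leading/trailing
# whitespace and backslashes in the path intact.
arr=()
while IFS= read -r -d '' f; do
  arr+=("$f")
done < <(find -L "$d" -maxdepth 1 \( -iname '*.jp*g' -o -iname '*.png' \) -print0)

for f in "${arr[@]}"; do echo "$f"; done
```

The loop never stores a NUL inside a variable; the NUL only acts as the delimiter on the input stream, which sidesteps the concern about null characters in shell variables.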