
I have a bash shell script that loops through all child directories (but not files) of a certain directory. The problem is that some of the directory names contain spaces.

Here are the contents of my test directory:

$ ls -F test
Baltimore/  Cherry Hill/  Edison/  New York City/  Philadelphia/  cities.txt

And the code that loops through the directories:

for f in `find test/* -type d`; do
  echo $f
done

Here's the output:

test/Baltimore
test/Cherry
Hill
test/Edison 
test/New
York
City
test/Philadelphia

Cherry Hill and New York City are treated as 2 or 3 separate entries.

I tried quoting the filenames, like so:

for f in `find test/* -type d | sed -e 's/^/\"/' | sed -e 's/$/\"/'`; do
  echo $f
done

but to no avail.

There's got to be a simple way to do this.


The answers below are great. But to make this more complicated - I don't always want to use the directories listed in my test directory. Sometimes I want to pass in the directory names as command-line parameters instead.

I took Charles' suggestion of setting the IFS and came up with the following:

dirlist="${@}"
(
  [[ -z "$dirlist" ]] && dirlist=`find test -mindepth 1 -type d` && IFS=$'\n'
  for d in $dirlist; do
    echo $d
  done
)

and this works just fine unless there are spaces in the command line arguments (even if those arguments are quoted). For example, calling the script like this: test.sh "Cherry Hill" "New York City" produces the following output:

Cherry
Hill
New
York
City
John Bachir
MCS
  • re: edit, `list="$@"` completely discards the original value's list-ness, collapsing it to a string. Please follow the practices in my answer *exactly as given* -- such an assignment is not encouraged anywhere therein; if you want to pass a list of command-line arguments to a program, you should collect them into an array, and expand that array directly. – Charles Duffy Jun 05 '18 at 20:18
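A minimal sketch of the array approach that comment describes (collect the arguments into an array, with a glob fallback; the trailing slash in test/*/ makes the glob match only directories):

dirs=( "$@" )                               # collect command-line arguments as a list
[ ${#dirs[@]} -eq 0 ] && dirs=( test/*/ )   # fall back to the subdirectories of test/
for d in "${dirs[@]}"; do
  echo "${d%/}"                             # strip the trailing slash from glob results
done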

20 Answers


First, don't do it that way. The best approach is to use find -exec properly:

# this is safe
find test -type d -exec echo '{}' +

The other safe approach is to use NUL-terminated list, though this requires that your find support -print0:

# this is safe
while IFS= read -r -d '' n; do
  printf '%q\n' "$n"
done < <(find test -mindepth 1 -type d -print0)

You can also populate an array from find, and pass that array later:

# this is safe
declare -a myarray
while IFS= read -r -d '' n; do
  myarray+=( "$n" )
done < <(find test -mindepth 1 -type d -print0)
printf '%q\n' "${myarray[@]}" # printf is an example; use it however you want
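Once populated, the whole array can also be handed to a single command (du -sh here is only an arbitrary example):

du -sh -- "${myarray[@]}"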

If your find doesn't support -print0, your result is then unsafe -- the below will not behave as desired if files exist containing newlines in their names (which, yes, is legal):

# this is unsafe
while IFS= read -r n; do
  printf '%q\n' "$n"
done < <(find test -mindepth 1 -type d)

If one isn't going to use one of the above, a third approach (less efficient in terms of both time and memory usage, as it reads the entire output of the subprocess before doing word-splitting) is to use an IFS variable which doesn't contain the space character. Turn off globbing (set -f) to prevent strings containing glob characters such as [], * or ? from being expanded:

# this is unsafe (but less unsafe than it would be without the following precautions)
(
 IFS=$'\n' # split only on newlines
 set -f    # disable globbing
 for n in $(find test -mindepth 1 -type d); do
   printf '%q\n' "$n"
 done
)

Finally, for the command-line parameter case, you should be using arrays if your shell supports them (i.e. it's ksh, bash or zsh):

# this is safe
for d in "$@"; do
  printf '%s\n' "$d"
done

will maintain separation. Note that the quoting (and the use of $@ rather than $*) is important. Arrays can be populated in other ways as well, such as glob expressions:

# this is safe
entries=( test/* )
for d in "${entries[@]}"; do
  printf '%s\n' "$d"
done
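
As a small refinement not in the original answer: a trailing slash on the glob restricts the matches to directories, which is what the question asks for:

# this is safe; the trailing slash makes the glob match only directories
entries=( test/*/ )
for d in "${entries[@]}"; do
  printf '%s\n' "${d%/}"   # strip the trailing slash if you don't want it
done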
Charles Duffy
  • didn't know about that '+' flavor for -exec. sweet – Johannes Schaub - litb Nov 19 '08 at 05:27
  • tho looks like it can also, like xargs, only put the arguments at the end of the given command :/ that's bugged me sometimes – Johannes Schaub - litb Nov 19 '08 at 05:35
  • I think -exec [name] {} + is a GNU and 4.4-BSD extension. (At least, it doesn't appear on Solaris 8, and I don't think it was in AIX 4.3 either.) I guess the rest of us may be stuck with piping to xargs... – Michael Ratanapintha Nov 19 '08 at 06:00
  • I've never seen the $'\n' syntax before. How does that work? (I would have thought that either IFS='\n' or IFS="\n" would work, but neither does.) – MCS Nov 19 '08 at 14:50
  • @crosstalk it's definitely in Solaris 10, I just used it. – Nick Jan 19 '11 at 15:45
  • IFS=$'\n' works pretty well. Probably better to add a backup variable before doing this: IFSBACKUP=$IFS;IFS=$'\n';...;IFS=$IFSBACKUP – Fedir RYKHTIK Aug 31 '11 at 13:45
  • @MichaelRatanapintha It's standards-compliant, not an extension -- but a fairly new addition to the standard, and some platforms haven't caught up. – Charles Duffy Nov 21 '12 at 14:46
  • +1. Doesn't turning off globbing need a `set -f` rather than a `set +f`? – iruvar Aug 09 '13 at 14:58
  • @Fedir, no need to backup the old value when (as here) it's done inside a subshell; the value reverts as soon as the subshell exits. – Charles Duffy Aug 19 '14 at 00:35
  • @Fedir, ...the other thing about `IFSBACKUP=$IFS; : ...; IFS=$IFSBACKUP` is that IFS being unset and IFS being set to an empty value are two different things (the former defaults to `$' \t\n'` and the latter doesn't), and that code loses the distinction. I'd suggest `oIFS=${IFS-$' \t\n'}`; that way, when you set `IFS=$oIFS` later, you aren't changing behavior if the initial value was unset. – Charles Duffy Apr 20 '16 at 14:29
  • Gawd. What nastiness this shell stuff can quickly turn into. – Tom Russell May 24 '21 at 02:46
  • `find test -type d -exec echo '{}' +` appears to concatenate the filenames. Putting the output in a shell variable and iterating over the variable tends to confirm it. – Tom Russell May 24 '21 at 02:51
  • @TomRussell, the `echo` here is a placeholder to be substituted with your actual command -- the command you would be running inside your loop. It's not part of the answer itself. – Charles Duffy May 24 '21 at 12:58
find . -type d | while read file; do echo $file; done

However, this doesn't work if the file name contains newlines. The above is the only solution I know of when you actually want to have the directory name in a variable. If you just want to execute some command, use xargs.

find . -type d -print0 | xargs -0 echo 'The directory is: '
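
With GNU xargs, the same pipeline can also batch and parallelize (the parallel execution mentioned in the comments below; -n1 gives each invocation one name, -P4 runs four at a time):

find . -type d -print0 | xargs -0 -n1 -P4 echo 'The directory is: '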
Johannes Schaub - litb
  • No need for xargs, see find -exec ... {} + – Charles Duffy Nov 19 '08 at 05:53
  • @Charles: for large numbers of files, xargs is much more efficient: it only spawns one process. The -exec option forks a new process for each file, which can be an order of magnitude slower. – Adam Rosenfield Nov 19 '08 at 05:54
  • I like xargs more. These two essentially seem to do the same thing, while xargs has more options, like running in parallel – Johannes Schaub - litb Nov 19 '08 at 05:57
  • Adam, no, that '+' one will aggregate as many filenames as possible and then execute. But it will not have such neat functions as running in parallel :) – Johannes Schaub - litb Nov 19 '08 at 05:57
  • hands down, who will have \n in their dirnames anyway :p stone them ^^ – Johannes Schaub - litb Nov 19 '08 at 06:08
  • @litb - people trying to take advantage of security flaws in your code, for one. Assuming folks will do the sane or reasonable thing is dangerous. – Charles Duffy Oct 05 '09 at 03:22
  • Note that if you want to do something with the filenames, you're going to have to quote them. E.g.: `find . -type d | while read file; do ls "$file"; done` – David Moles Nov 14 '12 at 19:13
  • +1 for using xargs, which incurs significant performance benefits over using find -exec. Use of xargs batches filenames to single command executions and optionally supports execution in parallel over multiple cores. – Coder Guy Nov 13 '14 at 22:01
  • @JonathanNeufeld, batching is also provided by `find -exec {} +`, which has been part of the POSIX standard since 2006. – Charles Duffy Jun 27 '15 at 16:32
  • @JonathanNeufeld, ...which, incidentally, was pointed out by litb in a prior comment in this thread back in '08. – Charles Duffy Nov 02 '17 at 16:34

Here is a simple solution which handles spaces and/or tabs in the filename. If you have to deal with other strange characters in the filename, like newlines, pick another answer.

The test directory

ls -F test
Baltimore/  Cherry Hill/  Edison/  New York City/  Philadelphia/  cities.txt

The code to go into the directories

find test -type d | while read f ; do
  echo "$f"
done

The filename must be quoted ("$f") if used as an argument. Without quotes, the spaces act as argument separators and multiple arguments are given to the invoked command.

And the output:

test/Baltimore
test/Cherry Hill
test/Edison
test/New York City
test/Philadelphia
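
To see why the quotes matter, compare an unquoted and a quoted expansion (a throwaway variable for illustration):

d='New  York'          # two spaces inside the name
printf '[%s]\n' $d     # unquoted: word-split into [New] and [York]
printf '[%s]\n' "$d"   # quoted: a single argument, [New  York]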
cbliard
  • thanks, this worked for the alias I was creating to list how much space each directory in the current folder is using, it was choking on some dirs with spaces in the previous incarnation. This works in zsh, but some of the other answers didn't: `alias duc='ls -d * | while read D; do du -sh "$D"; done;'` – Ted Naleid Jun 13 '11 at 17:52
  • If you are using zsh, you can also do this: `alias duc='du -sh *(/)'` – cbliard Jun 15 '11 at 15:49
  • @cbliard This is still buggy. Try running it with a filename with, say, a tab sequence, or multiple spaces; you'll note that it changes any of those to a single space, because you aren't quoting in your echo. And then there's the case of filenames containing newlines... – Charles Duffy Jul 21 '13 at 21:39
  • @CharlesDuffy I tried with tab sequences and multiple spaces. It works with quotes. I also tried with newlines and it does not work at all. I updated the answer accordingly. Thank you for pointing this out. – cbliard Jul 23 '13 at 11:37
  • @cbliard Right -- adding quotes to your echo command was what I was getting at. As for newlines, you can make that work by using find `-print0` and `IFS='' read -r -d '' f`. – Charles Duffy Jul 23 '13 at 11:54
  • @CharlesDuffy Thank you for the precision, though I'll keep the answer simple even if it is not fool-proof. – cbliard Jul 24 '13 at 08:08

This is exceedingly tricky in standard Unix, and most solutions run foul of newlines or some other character. However, if you are using the GNU tool set, then you can exploit the find option -print0 and use xargs with the corresponding option -0 (minus-zero). There are two characters that cannot appear in a simple filename; those are slash and NUL '\0'. Obviously, slash appears in pathnames, so the GNU solution of using a NUL '\0' to mark the end of the name is ingenious and fool-proof.
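
For concreteness, a sketch of that pairing, assuming a find that supports -print0 (echo stands in for whatever command you actually want to run per directory):

find test -mindepth 1 -type d -print0 | xargs -0 echo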

Jonathan Leffler

You could temporarily change IFS (the internal field separator):

OLD_IFS=$IFS     # Stores Default IFS
IFS=$'\n'        # Set it to line break
for f in `find test/* -type d`; do
    echo "$f"
done

IFS=$OLD_IFS


amazingthere

I use

SAVEIFS=$IFS
IFS=$(echo -en "\n\b")
for f in $( find "$1" -type d ! -path "$1" )
do
  echo $f
done
IFS=$SAVEIFS

Wouldn't that be enough?
Idea taken from http://www.cyberciti.biz/tips/handling-filenames-with-spaces-in-bash.html

murpel
  • great tip: that's very helpful for options to a command-line osascript (OS X AppleScript), where spaces split an argument into multiple parameters where only one is intended – tim Sep 06 '12 at 11:58
  • No, it's not enough. It's inefficient (due to the unnecessary use of `$(echo ...)`), doesn't handle filenames with glob expressions correctly, doesn't handle filenames which contain `$'\b'` or `$'\n'` characters correctly, and moreover converts multiple runs of whitespace into single whitespace characters on the output side due to incorrect quoting. – Charles Duffy Jul 21 '13 at 21:42

Don't store lists as strings; store them as arrays to avoid all this delimiter confusion. Here's an example script that'll either operate on all subdirectories of test, or the list supplied on its command line:

#!/bin/bash
if [ $# -eq 0 ]; then
        # if no args supplies, build a list of subdirs of test/
        dirlist=() # start with empty list
        for f in test/*; do # for each item in test/ ...
                if [ -d "$f" ]; then # if it's a subdir...
                        dirlist+=("$f") # add it to the list
                fi
        done
else
        # if args were supplied, copy the list of args into dirlist
        dirlist=("$@")
fi
# now loop through dirlist, operating on each one
for dir in "${dirlist[@]}"; do
        printf "Directory: %s\n" "$dir"
done

Now let's try this out on a test directory with a curve or two thrown in:

$ ls -F test
Baltimore/
Cherry Hill/
Edison/
New York City/
Philadelphia/
this is a dirname with quotes, lfs, escapes: "\''?'?\e\n\d/
this is a file, not a directory
$ ./test.sh 
Directory: test/Baltimore
Directory: test/Cherry Hill
Directory: test/Edison
Directory: test/New York City
Directory: test/Philadelphia
Directory: test/this is a dirname with quotes, lfs, escapes: "\''
'
\e\n\d
$ ./test.sh "Cherry Hill" "New York City"
Directory: Cherry Hill
Directory: New York City
Gordon Davisson
  • Looking back on this -- there actually *was* a solution with POSIX sh: You could reuse the `"$@"` array, appending to it with `set -- "$@" "$f"`. – Charles Duffy Jun 27 '15 at 16:35
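A sketch of that POSIX-sh variant, for concreteness (the positional parameters serve as the one array the shell guarantees):

set --                    # clear the positional parameters
for f in test/*/; do
  set -- "$@" "${f%/}"    # append each subdirectory
done
printf 'Directory: %s\n' "$@"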

Why not just put

IFS=$'\n'

in front of the for command? This changes the field separator from <space><tab><newline> to just <newline>.
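
In context, and borrowing the subshell and set -f precautions from the accepted answer so the change doesn't leak and glob characters aren't expanded, that looks like:

(
  IFS=$'\n'   # split on newlines only
  set -f      # disable globbing
  for d in $(find test -mindepth 1 -type d); do
    echo "$d"
  done
)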

oshunluvr
find . -print0 | while IFS= read -r -d $'\0' file; do echo "$file"; done
Udo Held
Freakus
  • `-d $'\0'` is precisely the same as `-d ''` -- because bash uses NUL-terminated strings, the first character of an empty string is a NUL, and for the same reason, NULs can't be represented inside of C strings at all. – Charles Duffy Jul 21 '13 at 21:43

P.S. If it is only about spaces in the input, then some double quotes worked smoothly for me...

read artist;

find "/mnt/2tb_USB_hard_disc/p_music/$artist" -type f -name *.mp3 -exec mpg123 '{}' \;
hardbutnot

To add to what Jonathan said: use the -print0 option for find in conjunction with xargs as follows:

find test/* -type d -print0 | xargs -0 command

That will execute the command `command` with the proper arguments; directories with spaces in their names will be handled properly (i.e. each will be passed in as a single argument).
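
For instance, to report the size of each directory (du -sh standing in for command):

find test/* -type d -print0 | xargs -0 du -sh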

Adam Rosenfield

I had to deal with whitespace in pathnames, too. What I finally did was use recursion and for item in /path/*:

function recursedir {
    local item
    for item in "${1%/}"/*
    do
        if [ -d "$item" ]
        then
            recursedir "$item"
        else
            command        # placeholder: run your real command on "$item" here
        fi
    done
}
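
Usage is then just, e.g. (a hypothetical starting directory):

recursedir "/some/starting directory"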
Gilles 'SO- stop being evil'
  • Don't use the `function` keyword -- it makes your code incompatible with POSIX sh, but has no other useful purpose. You can just define a function with `recursedir() {`, adding the two parens and removing the function keyword, and this will be compatible with all POSIX-compliant shells. – Charles Duffy Jul 21 '13 at 21:44

Convert the file list into a Bash array. This uses Matt McClure's approach for returning an array from a Bash function: http://notes-matthewlmcclure.blogspot.com/2009/12/return-array-from-bash-function-v-2.html The result is a way to convert any multi-line input to a Bash array.

#!/bin/bash

# This is the command where we want to convert the output to an array.
# Output is: fileSize fileNameIncludingPath
multiLineCommand="find . -mindepth 1 -printf '%s %p\\n'"

# This eval converts the multi-line output of multiLineCommand to a
# Bash array. To convert stdin, remove: < <(eval "$multiLineCommand" )
eval "declare -a myArray=`( arr=(); while read -r line; do arr[${#arr[@]}]="$line"; done; declare -p arr | sed -e 's/^declare -a arr=//' ) < <(eval "$multiLineCommand" )`"

for f in "${myArray[@]}"
do
   echo "Element: $f"
done

This approach appears to work even when bad characters are present, and is a general way to convert any input to a Bash array. The disadvantage is if the input is long you could exceed Bash's command line size limits, or use up large amounts of memory.

Approaches that pipe the list into the loop that eventually works on it have the disadvantage that reading stdin inside the loop is not easy (for example, asking the user for input), and that the loop runs in a new process, which is why variables you set inside the loop are not available after the loop finishes.
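
A quick illustration of that subshell pitfall, next to the process-substitution form that avoids it:

count=0
find . -type d | while read -r d; do count=$((count+1)); done
echo "$count"   # still 0: the loop ran in a subshell

count=0
while read -r d; do count=$((count+1)); done < <(find . -type d)
echo "$count"   # the real count: the loop ran in the current shell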

I also dislike setting IFS; it can mess up other code.

Steve Zobell
  • If you use `IFS='' read`, on the same line, the IFS setting is present only for the read command, and does not escape it. There's no reason to dislike setting IFS in this way. – Charles Duffy Jul 21 '13 at 21:47

Well, I see too many complicated answers. I don't want to parse the output of the find utility or write a loop, because find has an -exec option for this.

My problem was that I wanted to move all files with dbf extension to the current folder and some of them contained white space.

I tackled it like this:

find . -name '*.dbf' -exec mv '{}' . ';'

Looks much simpler to me.
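
With GNU mv, the -t (target directory first) option even lets find batch the moves instead of running one mv per file:

find . -name '*.dbf' -exec mv -t . '{}' +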

Tebe
#!/bin/bash

dirtys=()

for folder in *
do    
 if [ -d "$folder" ]; then    
    dirtys+=("$folder")
 fi    
done    

for dir in "${dirtys[@]}"    
do    
   for file in "$dir"/\*.mov   # <== *.mov
   do    
       #dir_e=`echo "$dir" | sed 's/[[:space:]]/\\\ /g'`   -- This line will replace each space into '\ '   
       out=`echo "$file" | sed 's/\(.*\)\/\(.*\)/\2/'`     # These two line code can be written in one line using multiple sed commands.    
       out=`echo "$out" | sed 's/[[:space:]]/_/g'`    
       #echo "ffmpeg -i $out_e -sameq -vcodec msmpeg4v2 -acodec pcm_u8 $dir_e/${out/%mov/avi}"    
       `ffmpeg -i "$file" -sameq -vcodec msmpeg4v2 -acodec pcm_u8 "$dir"/${out/%mov/avi}`    
   done    
done

The above code will convert .mov files to .avi. The .mov files are in different folders, and the folder names have white space in them too. My script converts the .mov files to .avi files in the same folder. I don't know whether it helps you.

Case:

[sony@localhost shell_tutorial]$ ls
Chapter 01 - Introduction  Chapter 02 - Your First Shell Script
[sony@localhost shell_tutorial]$ cd Chapter\ 01\ -\ Introduction/
[sony@localhost Chapter 01 - Introduction]$ ls
0101 - About this Course.mov   0102 - Course Structure.mov
[sony@localhost Chapter 01 - Introduction]$ ./above_script
 ... successfully executed.
[sony@localhost Chapter 01 - Introduction]$ ls
0101_-_About_this_Course.avi  0102_-_Course_Structure.avi
0101 - About this Course.mov  0102 - Course Structure.mov
[sony@localhost Chapter 01 - Introduction]$

Cheers!

Sony George
  • `echo "$name" | ...` doesn't work if `name` is `-n`, and how it behaves with names with backslash-escape sequences depend on your implementation -- POSIX makes behavior of `echo` in that case explicitly undefined (whereas XSI-extended POSIX makes expansion of backslash-escape sequences standard-defined behavior, and GNU systems -- including bash -- without `POSIXLY_CORRECT=1` break the POSIX standard by implementing `-e` (whereas the spec requires `echo -e` to print `-e` on output). `printf '%s\n' "$name" | ...` is safer. – Charles Duffy Aug 05 '15 at 16:32

I needed the same concept: to sequentially compress several directories or files from a certain folder. I solved it using awk to parse the list from ls, avoiding the problem of blank spaces in names.

source="/xxx/xxx"
dest="/yyy/yyy"

n_max=`ls . | wc -l`

echo "Loop over items..."
i=1
while [ $i -le $n_max ]; do
  item=`ls . | awk 'NR=='$i''`
  echo "File selected for compression: $item"
  tar -cvzf "$dest/$item.tar.gz" "$item"
  i=$(( i + 1 ))
done
echo "Done!!!"

What do you think?

user000001
Hìr0
find Downloads -type f | while IFS= read -r file; do printf "%q\n" "$file"; done
Tunaki

Just found out there are some similarities between my question and yours. Apparently, if you want to pass arguments into commands

test.sh "Cherry Hill" "New York City"

to print them out in order

for SOME_ARG in "$@"
do
    echo "$SOME_ARG";
done;

Notice that $@ is surrounded by double quotes; some notes here.
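
The difference is easy to see side by side ("$@" keeps the arguments separate; "$*" joins them into a single word):

set -- "Cherry Hill" "New York City"
printf '[%s]\n' "$@"   # [Cherry Hill] [New York City]
printf '[%s]\n' "$*"   # [Cherry Hill New York City]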

Jeffrey04

For me this works, and it is pretty much "clean":

for f in "$(find ./test -type d)" ; do
  echo "$f"
done
apaderno
  • But this is worse. The double-quotes around the find cause all path names to be concatenated as a single string. Change the _echo_ to an _ls_ to see the problem. – NVRAM Sep 19 '11 at 17:13

Just had a simple variant of the problem... convert files of type .flv to .mp3 (yawn).

for file in read `find . *.flv`; do ffmpeg -i ${file} -acodec copy ${file}.mp3;done

This recursively finds all the Macintosh user flash files and turns them into audio (copy, no transcode)... it's like the while above, noting that read instead of just 'for file in' will escape.

  • The `read` after `in` is one more word in the list you're iterating over. What you've posted is a slightly broken version of what the asker had, which doesn't work. You may have intended to post something different, but it's probably covered by other answers here anyway. – Gilles 'SO- stop being evil' Feb 24 '12 at 17:49