When splitting a string into an array, there are two scenarios:

If the string uses spaces as the separator, according to the following post:

So if I use:

string="Hello Unix World"
array1=($string)
echo ${array1[@]}
echo "size: '${#array1[@]}'"

read -a array2 <<< $string
echo ${array2[@]}
echo "size: '${#array2[@]}'"

The output is:

Hello Unix World
size: '3'
Hello Unix World
size: '3'

Both approaches work as expected.

Now, if the string uses something other than a space as the separator, according to the following post:

So if I use:

path="/home/human/scripts"
IFS='/' read -r -a array <<< "$path"

echo "Approach 1"
echo ${array[@]}
echo "size: '${#array[@]}'"

echo "Approach 2"
for i in "${array[@]}"; do
   echo "$i"
done

echo "Approach 3"
for (( i=0; i < ${#array[@]}; ++i )); do
    echo "$i: ${array[$i]}"
done

It prints:

Approach 1
home human scripts    <--- apparently all is ok, but see the line just below!
size: '4'
Approach 2
                      <--- an empty element
home
human
scripts
Approach 3
0:                    <--- confirmed, the empty element
1: home
2: human
3: scripts

Why does that empty element appear? How can I fix the command to avoid it?

Manuel Jordan
  • The empty element produces no visible output with `echo $@`. `printf '%s\n' "${array[@]}"` would have given you the same output for approach 1 and approach 2. – chepner Jan 07 '22 at 23:19
  • a delimiter has a 'before' field and an 'after' field; when `/` is the delimiter this - `/home/human/scripts` - has 4x fields ... the empty string before the first `/`, and the other 3 fields that you know about; if the input was `//home/human/scripts/` you'd have 6 fields ... 2 empty strings + `home` + `human` + `scripts` + 1 empty string – markp-fuso Jan 07 '22 at 23:20
  • I should say, too, that if you had properly quoted the array in approach 1, `echo "${array[@]}"` would have output ` home human scripts`, with the space separating the empty string from `home` visible. Without quotes, `echo` really does only get three arguments; the unquoted empty string "vanishes" during word-splitting. – chepner Jan 07 '22 at 23:25
  • I need to iterate over the array, so it is important to avoid the empty element. The string is a path for either macOS or Linux. – Manuel Jordan Jan 07 '22 at 23:27
  • @markp-fuso huge thanks for the explanation. – Manuel Jordan Jan 07 '22 at 23:39
  • fwiw, you'll see the same behavior with other tools/commands that use delimiters (eg, `cut` and `awk`) or that use `IFS` to split the input (eg, `read` - see konsolebox's example) – markp-fuso Jan 07 '22 at 23:42
  • @markp-fuso just to be clear about the `the empty string before the first /` part of your first comment: the `path="/home/human/scripts"` declaration starts with `/home`, not with ` /home` – Manuel Jordan Jan 07 '22 at 23:56
  • fwiw, I typed `/home` (no leading space) ... the formatted text just looks like it has a leading space (like in your comment, too :-) – markp-fuso Jan 08 '22 at 00:02
  • @chepner you are correct about the second comment; I tested it. In general, the confusion came from why the empty space appeared when there is no space before the first `/` character. – Manuel Jordan Jan 08 '22 at 00:06
  • @markp-fuso in that case `//home/human/scripts/` returns 5, not 6. I tested. – Manuel Jordan Jan 08 '22 at 00:11
  • The empty string is not the same as a space. The leading space in the output of `echo "${array[@]}"` comes from `echo` itself, not the array. – chepner Jan 08 '22 at 01:14
  • `array1=($string)` is buggy **in general**. See [BashPitfalls #50](http://mywiki.wooledge.org/BashPitfalls#hosts.3D.28_.24.28aws_....29_.29) – Charles Duffy Jan 08 '22 at 01:56
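
To make the quoting point from the comments concrete, here is a minimal sketch. It assumes an array with the same four elements that `read` produced in the question; the bracketed `printf` format is only an illustrative way to make the empty element visible.

```shell
#!/usr/bin/env bash
# Same contents as the array built from "/home/human/scripts" in the question.
array=("" "home" "human" "scripts")

# Unquoted: the empty element is dropped during word splitting,
# so echo receives only three arguments.
echo ${array[@]}

# Quoted: all four elements are passed; the brackets expose the empty one.
printf '[%s]\n' "${array[@]}"
```

The first command prints `home human scripts`; the second prints four lines, the first of which is `[]`.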

2 Answers

Your string is split into 4 parts: an empty one, and the three words.

path="/home/human/scripts"
IFS='/' read -r -a array <<< "$path"
declare -p array

Output:

declare -a array=([0]="" [1]="home" [2]="human" [3]="scripts")

There are many ways to fix it. One is to delete the empty values. Another is to strip the leading slash before splitting.

for i in "${!array[@]}"; do
    [[ ${array[i]} ]] || unset 'array[i]'
done

Or

IFS='/' read -r -a array <<< "${path#/}"

The first one also adapts to paths where slashes are repeated elsewhere, not only at the beginning.
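
One caveat with the `unset` approach: it leaves gaps in the indices (here the array would start at index 1), which can trip up a C-style index loop like "Approach 3" in the question. A minimal sketch of an alternative that copies the non-empty fields into a new, densely indexed array; the name `parts` and the sample path are only illustrative:

```shell
#!/usr/bin/env bash
# Sample input with leading, repeated, and trailing slashes.
path="//home/human//scripts/"
IFS='/' read -r -a array <<< "$path"

# Copy only the non-empty fields into a fresh array indexed from 0.
parts=()
for field in "${array[@]}"; do
    if [[ -n $field ]]; then
        parts+=("$field")
    fi
done

# Index-based iteration now works without gaps.
for (( i = 0; i < ${#parts[@]}; ++i )); do
    echo "$i: ${parts[$i]}"
done
```

This prints `0: home`, `1: human`, `2: scripts`.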

konsolebox

Just to pile on with what is really a formatted comment:

The manual (in 3.5.7 Word Splitting) describes IFS as a "field terminator":

The shell treats each character of $IFS as a delimiter, and splits the results of the other expansions into words using these characters as field terminators.

For IFS=/ read -a fields <<< "/home/user", the first field is the empty string terminated by the first slash.
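
The "terminator" wording also explains why a trailing slash does not add an empty field while a leading one does. A small sketch to check this; `count_fields` is a hypothetical helper written for this illustration, not something from the manual:

```shell
#!/usr/bin/env bash
# Hypothetical helper: split the argument on '/' and print the field count.
count_fields() {
    local -a fields
    IFS='/' read -r -a fields <<< "$1"
    echo "${#fields[@]}"
}

count_fields "/home/user"     # leading slash terminates an empty first field
count_fields "home/user/"     # trailing slash only terminates "user"
count_fields "//home/user"    # each extra leading slash terminates another empty field
```

With bash this prints 3, 2 and 4, which matches the count of 5 reported in the comments for `//home/human/scripts/`.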

glenn jackman