26

According to this reference sheet on hyperpolyglot.org, the following syntax can be used to set an array.

i=(1 2 3)

But I get an error with dash, which is the default /bin/sh on Ubuntu and should be POSIX compliant.

# Trying the syntax with dash in my terminal
> dash -i
$ i=(1 2 3)
dash: 1: Syntax error: "(" unexpected
$ exit

# Working fine with bash
> bash -i
$ i=(1 2 3)
$ echo ${i[@]}
1 2 3
$ exit

Is the reference sheet misleading or erroneous?
If yes, what would be the correct way to define an array or a list and be POSIX compliant?

Toby Speight
zoom
  • 7
    There are no arrays in POSIX. If you look closely, that is in reference to a *literal*. That entire section of **hyperpolyglot.org** is just `flat wrong` (probably done by M$). – David C. Rankin Feb 13 '16 at 22:18
  • I do not understand what the sheet means by literal in this context; the section of the table is called `resizable arrays`. But even if it was not about arrays, it should execute correctly according to the reference sheet. But you are right, the important part is that there is no concept of arrays. – zoom Feb 13 '16 at 22:26
  • Thanks about that clarification, I was considering relying on this sheet, I will look for another then. – zoom Feb 13 '16 at 22:28
  • 3
    Ouch, that's a terrible "reference". It has non-POSIX `function` keyword, magic `$RANDOM` and `echo -n`, recommends `trap exit ERR` rather than a more useful `trap 'exit 1' ERR`, and is extremely reckless with quoting. Not recommended. – Toby Speight Jan 16 '20 at 10:26

4 Answers

29

POSIX does not specify arrays, so if you are restricted to POSIX shell features, you cannot use arrays.

I'm afraid your reference is mistaken. Sadly, not everything you find on the internet is correct.

rici
  • I read that too, but I needed a confirmation. The sheet seemed so detailed and precise that I was confused – zoom Feb 13 '16 at 22:24
  • 2
    @zoom Vendors of swampland in Florida also offer detailed and precise survey reports. – rici Feb 13 '16 at 22:28
  • 1
    Yes, but a common confusion is when a shell like `bash` is running in POSIX mode, e.g. with `--posix` invocation, or `#!/bin/sh` shebang, then it will quietly understand this non-POSIXism. – Jack Wasey Mar 07 '21 at 11:18
  • This is only partially true. POSIX does specify the list of arguments, which can be used as `$@` which can be adjusted with `shift` and `set` as noted in [my answer](https://stackoverflow.com/a/75441572/519360). – Adam Katz Feb 13 '23 at 22:00
26

As rici said, dash doesn't have array support. However, there are workarounds if what you're looking to do is write a loop.

A for loop can't iterate over an array in dash, but you can split a string using a while loop and the read builtin. Since the dash read builtin doesn't support custom delimiters either, you have to work around that too.

Here's a sample script:

myArray="a b c d"

echo "$myArray" | tr ' ' '\n' | while read -r item; do
  # use "$item" here; -r keeps any backslashes literal
  echo "$item"
done

Some deeper explanation on that:

  • tr ' ' '\n' does a single-character replacement, turning each space into a newline, which is the default delimiter for the read builtin.

  • read returns a failing exit code when it reaches the end of its input, which happens once everything has been processed, and that ends the loop.

  • Since echo prints a trailing newline after its input, the last "element" in your array is processed too.
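
One caveat with the pipeline form is worth a sketch. In dash (and in bash by default), every part of a pipeline runs in a subshell, so variables assigned inside the while loop are gone once the loop ends. The illustration below is not part of the original answer; it feeds the loop from a here-document instead, and the names `items` and `count` are made up for the example.

```shell
#!/bin/sh
myArray="a b c d"

# Pipeline form: in dash the loop body runs in a subshell,
# so the count is typically lost after the loop.
count=0
echo "$myArray" | tr ' ' '\n' | while read -r item; do
  count=$((count + 1))
done
echo "after pipeline: $count"

# Here-document form: the loop runs in the current shell,
# so the count survives.
items=$(echo "$myArray" | tr ' ' '\n')
count=0
while read -r item; do
  count=$((count + 1))
done <<EOF
$items
EOF
echo "after here-document: $count"
```

If the loop only prints each item, as in the sample script, the pipeline form is fine; the difference matters only when the loop has to leave state behind.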

This would be equivalent to the bash code:

myArray=(a b c d)

for item in "${myArray[@]}"; do
  echo "$item"
done

If you want to retrieve the n-th element (let's say the 2nd for the purpose of the example):

myArray="a b c d"

echo "$myArray" | cut -d' ' -f2 # change -f2 to -fn
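
If the items themselves may contain spaces, the same idea can be sketched with another delimiter and the shell's own field splitting instead of cut. This is only an illustration: `nth_item` is a hypothetical helper, and `|` is an assumed delimiter that must never occur inside an item.

```shell
#!/bin/sh
# nth_item INDEX LIST DELIM: print the INDEX-th (1-based) field of LIST.
# Hypothetical helper. The body is a subshell, so the changes to IFS
# and the set -f option do not leak into the caller.
nth_item() (
  n=$1
  IFS=$3
  set -f          # no globbing while the list is being split
  set -- $2       # intentionally unquoted: split LIST on DELIM
  [ "$n" -ge 1 ] && [ "$n" -le "$#" ] || exit 1
  shift "$((n - 1))"
  printf '%s\n' "$1"
)

myArray='first item|second item|third'
nth_item 2 "$myArray" '|'    # prints: second item
```

An index that is out of range or not a number makes the helper return a non-zero status and print nothing on standard output.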
Karim Alibhai
  • 1
    Thanks! The last code snippet was just what I was looking for :) – kaiya Dec 15 '19 at 16:13
  • 2
    The issue with this is that it tries to store several separate strings inside one single string. As soon as you want to store strings that contains whitespace characters, you will fail to retrieve them properly. You would have to use some other safe delimiter other than space, and then implement your own parsing of the string. – Kusalananda Dec 16 '19 at 09:48
  • If the array contains file paths, any file name may contain a space, but you can use another symbol, e.g. pipe `|`. So instead of `myArray="a b c d"` you can try `myArray="a|b|c|d"` and then change `tr ' ' '\n'` to `tr '|' '\n'`. I tested it and it works fine. – Sergey Ponomarev Jun 22 '20 at 12:25
  • `shellcheck` seems to suggest that it would be beneficial to have `while read -r item; do` to prevent the mangling of any backslash characters. – Abhishek Chakravarti Jul 11 '21 at 06:41
  • Why are you using the external command `tr` for this? You're already losing spacing, why not just use `for item in $myArray` (note the lack of quotes)? If you want to preserve spaces, put it in a function and locally change `$IFS` to the desired delimiter (like `local IFS='|'`) ... or use `$@` (see [my answer](https://stackoverflow.com/a/75441572/519360)). – Adam Katz Feb 13 '23 at 22:06
17

It is true that the POSIX sh shell does not have named arrays in the same sense that bash and other shells have, but there is a list that sh shells (as well as bash and others) can use, and that's the list of positional parameters.

This list usually contains the arguments passed to the current script or shell function, but you can set its values with the set built-in command:

#!/bin/sh

set -- this is "a list" of "several strings"

In the above script, the positional parameters $1, $2, ..., are set to the five strings shown. The -- is used to make sure that you don't unexpectedly set a shell option (which the set command is also able to do). This is only ever an issue if the first argument starts with a - though.
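
As a small sketch of the points above (the variable names `count`, `third` and `first` are made up for the illustration), the list length and individual items can be read back with $# and $1, $2, ..., and the leading -- keeps an item that starts with - from being parsed as an option:

```shell
#!/bin/sh
set -- this is "a list" of "several strings"
count=$#        # 5
third=$3        # a list

# With the leading --, a first item that starts with "-" is stored
# as data instead of being parsed as a shell option.
set -- -x "second item"
first=$1        # -x

printf '%s\n' "$count" "$third" "$first"
```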

To e.g. loop over these strings, you can use

for string in "$@"; do
    printf 'Got the string "%s"\n' "$string"
done

or the shorter

for string do
    printf 'Got the string "%s"\n' "$string"
done

or just

printf 'Got the string "%s"\n' "$@"

set is also useful for expanding globs into lists of pathnames:

#!/bin/sh

set -- "$HOME"/*/

# "visible directory" below really means "visible directory, or visible 
# symbolic link to a directory".

if [ ! -d "$1" ]; then
    echo 'You do not have any visible directories in your home directory'
else
    printf 'There are %d visible directories in your home directory\n' "$#"

    echo 'These are:'
    printf '\t%s\n' "$@"
fi

The shift built-in command can be used to shift off the first positional parameter from the list.

#!/bin/sh

# pathnames
set -- path/name/1 path/name/2 some/other/pathname

# insert "--exclude=" in front of each
for pathname do
    shift
    set -- "$@" --exclude="$pathname"
done

# call some command with our list of command line options
some_command "$@"
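
To see what the loop does to the list, here is a traced variant of the same script. It is a sketch for illustration only: the placeholder some_command is replaced by printf so the result can be inspected.

```shell
#!/bin/sh
set -- path/name/1 path/name/2 some/other/pathname

# The words to loop over are fixed when the loop starts, so changing
# the positional parameters inside the body is safe.
for pathname do
    shift                              # drop the old first item
    set -- "$@" --exclude="$pathname"  # append its rewritten form
    printf 'list is now: %s\n' "$*"
done

# stand-in for some_command: print one argument per line
printf '%s\n' "$@"
```

Each pass removes one original pathname from the front and appends its rewritten form, so after three passes the list holds only the three --exclude= options in the original order.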

Kusalananda
  • Is there something like an `unshift` in POSIX? Can you use `$@` as a stack somehow? – Lassi Oct 15 '20 at 06:23
  • 6
    @Lassi To add at the start: `set -- "$item" "$@"`. To add at the end: `set -- "$@" "$item"`. You can't easily delete the last element of `$@`, but you can delete the first with `shift`. A stack could be implemented by pushing/popping the first element. – Kusalananda Oct 15 '20 at 06:28
  • Deleting the last element of `$@` is only mildly cumbersome. I've added an [answer](https://stackoverflow.com/a/75441572/519360) that demonstrates `shift`, `unshift`, `push`, `pop` (which performs that deletion), and other array functions for `$@`. – Adam Katz Feb 13 '23 at 22:13
2

You can use the argument list $@ as an array in POSIX shells

It's trivial to initialize, shift, unshift, and push:

# initialize $@ containing a string, a variable's value, and a glob's matches
set -- "item 1" "$variable" *.wav
# shift (remove first item, accepts a numeric argument to remove more)
shift
# unshift (prepend new first item)
set -- "new item" "$@"
# push (append new last item)
set -- "$@" "new item"
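
The same operations can be traced on a concrete list. This is a sketch with made-up item values and no glob, so the result does not depend on the files in the current directory; the comments show the list after each step, with items separated by |.

```shell
#!/bin/sh
variable="two words"

set -- "item 1" "$variable" last   # item 1 | two words | last
shift                              # two words | last
set -- "new first" "$@"            # new first | two words | last
set -- "$@" "new last"             # new first | two words | last | new last
shift 2                            # last | new last

printf '%s\n' "$#" "$1" "$2"
```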

Here's a pop implementation:

# pop (remove last item, store it in $last)
i=0 len=$#                               # remember the original length
for last in "$@"; do 
  if [ $((i+=1)) = 1 ]; then set --; fi  # increment $i. first run: empty $@
  if [ $i = $len ]; then break; fi       # stop before re-adding the last item
  set -- "$@" "$last"                    # add $last back to $@
done
echo "$last has been removed from ($*)"

($* joins the contents of $@ with the first character of $IFS, which is a space by default.)
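
Another way to pop, sketched here as an alternative rather than as this answer's own method, is to rotate the list: move the first item to the end once for every item except the last, which brings the original last item to the front where shift can remove it. The counter name `n` is an assumption of the sketch.

```shell
#!/bin/sh
set -- "item 1" "item 2" "item 3"

if [ "$#" -gt 0 ]; then      # shift on an empty list is an error
  n=$#
  while [ "$n" -gt 1 ]; do
    set -- "$@" "$1"         # copy the first item to the end...
    shift                    # ...then drop it from the front
    n=$((n - 1))
  done
  last=$1                    # the original last item is now in front
  shift
fi

printf 'popped "%s", %d left\n' "$last" "$#"
```

The remaining items keep their original order, and items containing spaces survive because every expansion is quoted.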

Iterate through the $@ array and modify some of its contents:

i=0
for a in "$@"; do 
  if [ $((i+=1)) = 1 ]; then set --; fi  # increment $i. first run: empty $@
  a="${a%.*}.mp3"       # example tweak to $a: change extension to .mp3
  set -- "$@" "$a"      # add $a back to $@
done

Refer to items in the $@ array:

echo "$1 is the first item"
echo "$# is the length of the array"
echo "all items in the array (properly quoted): $@"
echo "all items in the array (in a string): $*"
[ "$n" -ge 0 ] && eval "echo \"the ${n}th item in the array is \${$n}\""  # braces are needed for n >= 10

(eval is dangerous, so I've ensured $n is a number before running it)
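
If you would rather avoid eval for indexed access, one possible sketch passes the list to a function and shifts to the wanted position. `nth_arg` and `_n` are hypothetical names, and the case pattern is just one way to make sure the index consists of digits only.

```shell
#!/bin/sh
# nth_arg N ITEM...: print the N-th item (1-based). Hypothetical helper.
nth_arg() {
  case $1 in
    ''|0*|*[!0-9]*) return 1 ;;  # reject empty, non-digit or zero-led input
  esac
  _n=$1
  shift                          # drop N, leaving only the items
  [ "$_n" -le "$#" ] || return 1 # index past the end of the list
  shift "$((_n - 1))"
  printf '%s\n' "$1"
}

set -- a b c d e f g h i j k l
nth_arg 11 "$@"    # prints: k
```

Because the function works on its own copy of the list, the caller's $@ is left unchanged.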

There are a few ways to set $last to the final item of a list without popping it:
with a function:

last_item() { shift $(($# - 1)) 2>/dev/null && printf %s "$1"; }
last="$(last_item "$@")"

... or with an eval (safe since $# is always a number):

eval last="\${$#}"  # braces are needed once $# reaches 10

... or with a loop:

for last in "$@"; do true; done

⚠️ Warning: Functions have their own $@ arrays. You'll have to pass it to the function, like my_function "$@" if read-only or else set -- $(my_function "$@") if you want to manipulate $@ and don't expect spaces in item values.
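
A minimal sketch of that warning, using made-up function names: a function called without arguments sees an empty list, and a set -- inside a function only changes the function's own copy.

```shell
#!/bin/sh
count_args() { echo "$#"; }
replace_args() { set -- x y; }   # only changes the function's own list

set -- "a b" c d
without=$(count_args)        # the function sees no arguments: 0
with=$(count_args "$@")      # the caller's list passed explicitly: 3
replace_args
after=$#                     # the caller's list is untouched: 3

printf '%s %s %s\n' "$without" "$with" "$after"
```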

If you need to handle spaces in item values, it becomes much more cumbersome:

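# ensure my_function() returns each list item on its own line
# (the loop reads from a here-document rather than a pipe, because a piped
# loop may run in a subshell, where its changes to $@ would be lost)
i=1
while IFS= read -r line; do
  if [ $i = 1 ]; then i=0; set --; fi
  set -- "$@" "$line"
done <<EOF
$(my_function "$@")
EOF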

This still won't work with newlines in your items. You'd have to escape them to another character (but not null) and then escape them back later. See "Iterate through the $@ array and modify some of its contents" above. You can either iterate through the array in a for loop and then run the function, then modify the variables in a while IFS= read line loop, or just do it all in a for loop without a function.

Adam Katz
  • `for i in "$@"; do ...; done` can be shortened into `for i do ...; done`. This relieves the user from remembering to quote `$@`. – Kusalananda Feb 14 '23 at 06:45
  • @Kusalananda – Most of the quotes you added to my answer were unnecessary (numbers never need to be quoted; if you change `$IFS` to include a digit, you're not going to like the fallout). You _removed_ quotes from an instance that needed them. I have reverted most of your changes. Yes, `for i do …; done` is shorter and POSIX-compliant, but I do not consider it to be intuitive. – Adam Katz Feb 14 '23 at 16:18
  • Quotes are not needed on the right-hand side of assignments since the shell won't perform splitting or globbing there. If you remove quoting on expansions just because they contain digits, you'd better add that you assume `IFS` can never contain digits or just explicitly reset `IFS` to its default value. You gain very little by refusing to quote those expansions. – Kusalananda Feb 14 '23 at 17:26