The shell passes a single argument vector (that is to say, a simple C array of strings) off to a program being run. This is an OS-level limitation: There exists no method to pass structured data between two programs (any two programs, written in any language!) in an argument list, except by encoding that structure in the contents of the members of this array of C strings.
Approach: Length Prefixes
If efficiency is a goal (both in terms of ease-of-parsing and amount of space used out of the ARG_MAX
limit on command-line and environment storage), one approach to consider is prefixing each array with an argument describing its length.
By providing length arguments, however, you can indicate which sections of that argument list are supposed to be part of a given array:
./myScript \
"${#array1[@]}" "${array1[@]}" \
"${#array2[@]}" "${array2[@]}" \
"${#array3[@]}" "${array3[@]}"
...then, inside the script, you can use the length arguments to split content back into arrays:
#!/usr/bin/env bash
array1=( "${@:2:$1}" ); shift "$(( $1 + 1 ))"
array2=( "${@:2:$1}" ); shift "$(( $1 + 1 ))"
array3=( "${@:2:$1}" ); shift "$(( $1 + 1 ))"
declare -p array1 array2 array3
If run as ./myScript 3 a b c 2 X Y 1 z
, this has the output:
declare -a array1='([0]="a" [1]="b" [2]="c")'
declare -a array2='([0]="X" [1]="Y")'
declare -a array3='([0]="z")'
Approach: Per-Argument Array Name Prefixes
Incidentally, a practice common in the Python world (particularly with users of the argparse
library) is to allow an argument to be passed more than once to amend to a given array. In shell, this would look like:
./myScript \
"${array1[@]/#/--array1=}" \
"${array2[@]/#/--array2=}" \
"${array3[@]/#/--array3=}"
and then the code to parse it might look like:
#!/usr/bin/env bash
declare -a args array1 array2 array3
while (( $# )); do
case $1 in
--array1=*) array1+=( "${1#*=}" );;
--array2=*) array2+=( "${1#*=}" );;
--array3=*) array3+=( "${1#*=}" );;
*) args+=( "$1" );;
esac
shift
done
Thus, if your original value were array1=( one two three ) array2=( aye bee ) array3=( "hello world" )
, the calling convention would be:
./myScript --array1=one --array1=two --array1=three \
--array2=aye --array2=bee \
--array3="hello world"
Approach: NUL-Delimited Streams
Another approach is to pass a filename for each array from which a NUL-delimited list of its contents can be read. One chief advantage of this approach is that the size of array contents does not count against ARG_MAX
, the OS-enforced command-line length limit. Moreover, with an operating system where such is available, the below does not create real on-disk files but instead creates /dev/fd
-style links to FIFOs written to by subshells writing the contents of each array.
./myScript \
<( (( ${#array1[@]} )) && printf '%s\0' "${array1[@]}") \
<( (( ${#array2[@]} )) && printf '%s\0' "${array2[@]}") \
<( (( ${#array3[@]} )) && printf '%s\0' "${array3[@]}")
...and, to read (with bash 4.4 or newer, providing mapfile -d
):
#!/usr/bin/env bash
mapfile -d '' array1 <"$1"
mapfile -d '' array2 <"$2"
mapfile -d '' array3 <"$3"
...or, to support older bash releases:
#!/usr/bin/env bash
declare -a array1 array2 array3
while IFS= read -r -d '' entry; do array1+=( "$entry" ); done <"$1"
while IFS= read -r -d '' entry; do array2+=( "$entry" ); done <"$2"
while IFS= read -r -d '' entry; do array3+=( "$entry" ); done <"$3"