
I found that the most convenient way to pass named arguments to bash functions is through 'temporary env variables', like so: `kwd_arg=1 foo`. I want to use this to pass array variables, but the two features apparently don't mix as expected, and I'd like to know how to use them together properly.

I've tried the array syntax without a function involved and without a temporary assignment: both worked. However, when the two are used together, the array argument is treated as a plain string.

# bash function that reads a keyword argument from a temporary env variable
foo() {
    echo "${array_arg[0]}"
    echo "${array_arg[1]}"
}

# doesn't work, array arg treated as string
array_arg=(1 2) foo
# output:
# (1 2)
#

# surprisingly, this works when the array is set globally
array_arg=(1 2)
foo
# output:
# 1
# 2

# works of course
echo ${array_arg[0]}
echo ${array_arg[1]}
# output: same as above
profPlum
  • How are temporary environment variables the most convenient way to pass named arguments? – Benjamin W. Jul 22 '19 at 22:01
  • @BenjaminW. Because `getopts`, the only semi-viable alternative, has very convoluted syntax and doesn't play nicely with positional arguments mixed in. – profPlum Jul 22 '19 at 22:02
  • The environment can't store arrays at all; it only stores strings. – Charles Duffy Jul 22 '19 at 22:04
  • If you look at `/proc/self/environ` with a hex editor, it'll be very obvious why that's the case; you have a NUL-delimited flat `key=value` file (well, memory region exported as a file, but it's all the same content). Array elements are themselves NUL delimited, so how would an array be represented there? – Charles Duffy Jul 22 '19 at 22:05
  • @CharlesDuffy: It appears that it can store arrays. As my example code demonstrates, the viable access patterns of the variable change based on how I do this, despite an echo showing that the string value is unchanged. Also, I have no idea what you mean by 'how could they be represented?', but bash arrays *do* [exist](https://www.tldp.org/LDP/Bash-Beginners-Guide/html/sect_10_02.html), so they are represented somehow... – profPlum Jul 22 '19 at 22:15
  • @profPlum, bash arrays are not stored in the environment, they're stored in heap memory. The environment is a separate memory block allocated by the operating system, not `malloc()`ed by the running process. Please be specific about exactly which part of your example code shows bash arrays stored *in the environment* specifically. – Charles Duffy Jul 22 '19 at 22:15
  • @profPlum, assuming you're using a recent bash version, use a nameref: `foo() { local -n theArray=$1; echo "${theArray[0]}"; }` then initialize the array and pass the array *name* to the function: `array_arg=(1 2); foo array_arg` -- however, you must use a different variable name or you'll see a "circular reference" error. – glenn jackman Jul 22 '19 at 22:19
  • @glennjackman I was just about to suggest that. :) – Charles Duffy Jul 22 '19 at 22:19
  • (it's still not an "env var" in the literal sense of being a variable stored in the environment and thus readable by child processes, but insofar as the OP only cares about communicating with a different function in the same process, it should be good enough for the immediate use case). – Charles Duffy Jul 22 '19 at 22:20
  • @CharlesDuffy Yes, you're correct. Maybe I should avoid using arrays? For example, do you know whether it is possible to access a space-delimited string as an array elegantly, sort of like bash for-each loops do? – profPlum Jul 22 '19 at 22:37
  • If your string is *genuinely* space-delimited, there's no reason not to just pass it that way; you can read it into an array (`read -r -a items <<<"$items_str"; for item in "${items[@]}"; do ...; done`) on the receiving end. (`for item in $items` is buggy for reasons touched on in [BashPitfalls #1](http://mywiki.wooledge.org/BashPitfalls#for_f_in_.24.28ls_.2A.mp3.29); those issues can be mitigated by turning off globbing before expansion). – Charles Duffy Jul 22 '19 at 22:39
  • @CharlesDuffy if my args will have no glob patterns in them is there any reason not to do something like: `arg="1 2" { arg=( $arg ); ... }`? – profPlum Jul 22 '19 at 23:34
  • @profPlum, do you know the active value of `IFS`? If some code runs `IFS=:` and leaves it set to that value, then `$arg` will behave quite unlike what you intend. See also [BashPitfalls #50](http://mywiki.wooledge.org/BashPitfalls#hosts.3D.28_.24.28aws_....29_.29). – Charles Duffy Jul 22 '19 at 23:37
  • @profPlum, ...moreover, the more assumptions you make about the environment, the more fragile your code becomes, and the more risk someone takes by reusing that code, or those idioms, elsewhere. Moreover, assumptions are sometimes wrong -- the worst data loss event I've been personally present for was caused by someone assuming files in a certain directory "couldn't ever" have non-hex digits in their names; it took a bad pointer in a C library used in a Python program (written by trusted staff!) to prove them wrong, but that single bad assumption wiped backups of customer-billing data. – Charles Duffy Jul 22 '19 at 23:37
  • I see your point, and I guess it is a matter of opinion and expectation of reuse, but I would prefer the solution that vastly simplifies the syntax. Some things in bash are just way too complicated, and it makes me think that if you have to do something so complicated, the language was probably originally designed to be used differently... – profPlum Jul 22 '19 at 23:55
  • The language was designed to be a strict superset of a 1991 specification that derives directly from a (commercially licensed) 1980s extension of a shell from the 1970s. If it were designed from scratch, you'd see easy things being easy; instead, you have *backwards-compatible* behaviors being easy to access, because they're the ones that got first pick at the space of available syntax; functionality added only after folks discovered problems with the traditional ways got shoehorned into places where they're harder to access, because the "good" (easy-to-write) syntax was already taken. – Charles Duffy Jul 23 '19 at 00:06
  • ...which is to say -- if you want to make accurate inferences about authorial intent, it helps to know who the authors were and what constraints they were working with at the time. – Charles Duffy Jul 23 '19 at 00:11
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/196892/discussion-between-profplum-and-charles-duffy). – profPlum Jul 23 '19 at 23:12
  • @profPlum, ...so, I went back to work without answering re: my thoughts on more modern languages that can replace backwards-compatible POSIX-y shells. I still stand by the recommendation of Go when your needs get substantial, but as something more lightweight, http://rash-lang.org/ is worth considering. – Charles Duffy Jul 24 '19 at 03:00
  • You also might consider [using Julia as a shell](https://docs.julialang.org/en/v1/manual/running-external-programs/index.html); particularly in your field, Julia is likely to be interesting, as [high-performance numeric computing](https://julialang.org/benchmarks/) is its primary target field. – Charles Duffy Jul 24 '19 at 03:18
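
As a rough sketch of the string-based approach discussed in the comments above: the function below takes a space-delimited string from a temporary env variable and splits it into an array with `read -r -a`. The names `foo`, `items_str`, and `items` are only illustrative. Setting `IFS` on the `read` command itself is an assumption added here (the thread did not settle on it) as one way to avoid depending on whatever `IFS` value the caller left behind.

```bash
# Illustrative function: receives a space-delimited string through a
# temporary environment variable and splits it into a local array.
foo() {
    local -a items
    # IFS is set for this read only, so a modified global IFS
    # does not change how the string is split. read does not glob.
    IFS=' ' read -r -a items <<<"${items_str-}"
    printf '%s\n' "${items[@]}"
}

# The keyword-argument call style from the question, with a plain string.
items_str="1 2" foo
# prints:
# 1
# 2
```

This only holds up if the values can never contain spaces themselves; for anything less constrained, passing the array by name is likely the safer route.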

1 Answer


This does not appear to be a good way to pass arrays:

If `foo() { declare -p FOO; }`, then

$ FOO=bar foo
declare -x FOO="bar"            # OK
$ FOO=(a b c) foo
declare -x FOO="(a b c)"        # not an array

but

$ export FOO=(a b c)
$ foo
declare -ax FOO=([0]="a" [1]="b" [2]="c")
glenn jackman
  • Try `bash -c 'declare -p FOO'`, and you'll see that it wasn't actually exported at all. Which is what one would expect, since the environment only stores strings at the operating system level. – Charles Duffy Jul 22 '19 at 22:09
  • That's the `-x` part: the variable's export attribute is set. – glenn jackman Jul 22 '19 at 22:10
  • Yes, the flag is set, but once again, *it's not actually exported*; a subprocess can't see it. See https://ideone.com/XR9tRL – Charles Duffy Jul 22 '19 at 22:11
  • Right, arrays can't live in the environment. – glenn jackman Jul 22 '19 at 22:16
  • Indeed, but the OP seems unclear on that point. – Charles Duffy Jul 22 '19 at 22:17
  • I'm sorry, could you clarify your answer? I'm not familiar with the `declare` command, and I'm not following your 'if, then, but' logic. – profPlum Jul 22 '19 at 22:18
  • At a bash prompt, type `help declare`. `declare -p name` just dumps the contents of the variable `name`, along with its attributes. The demonstration is to show that you can't put arrays in the environment, only plain "scalar" values. – glenn jackman Jul 22 '19 at 22:21
  • @glennjackman: so are you saying then that they are stored elsewhere in my example then get mangled somehow when passed into the env? And if so do you have an example of a good alternative to pass arrays similarly? – profPlum Jul 22 '19 at 22:30
  • I don't understand your question: *what* are stored elsewhere? I think you're asking why you get the "mangled" output `"(1 2)"` when you call `array_arg=(1 2) foo`? You'd have to look at the bash source code to see exactly what's happening. Clearly the first step is that bash does parse it as an array -- we can see that because the output contains a space but you didn't quote the value. But why it stores the scalar value `"(1 2)"` instead of `"1 2"` or `"([0]=1 [1]=2)"` or something else I don't know. Perhaps the POSIX standard says something about it. – glenn jackman Jul 23 '19 at 01:59
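
For the question of a workable alternative, here is a minimal sketch of the nameref approach suggested in the comments on the question. It assumes bash 4.3 or later, where `local -n` is available; the name `the_array` is illustrative and must differ from the caller's variable name, otherwise bash reports a circular name reference.

```bash
# Sketch: pass an array to a function by name (requires bash 4.3+).
foo() {
    # the_array becomes an alias for the variable whose name is in $1.
    local -n the_array=$1
    printf '%s\n' "${the_array[@]}"
}

array_arg=(1 2)
foo array_arg
# prints:
# 1
# 2
```

The array never passes through the environment here, so this only works between functions in the same shell process, not for child processes.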