Accessing shell array/hash environment variables from Ruby

Question

It appears that Ruby's ENV mechanism doesn't permit access to shell environment variables that are declared as arrays or associative arrays (hashes). (At least specific to the bash shell.) However, I cannot find this limitation documented anywhere.

Is this in fact a hard and intentional limitation, or is there actually a way to access structured shell environment variables from within Ruby?

Environment variables can only be strings. Shell arrays are not environment variables. — Barmar, Jun 28 '23 at 17:44
Do you mean 'Environment variables in Ruby can only be strings. Shell array environment variables are not strings'? Because shell array variables are quite clearly environment variables from the POV of the shell and their access semantics. And Ruby's `ENV` says that an envariable value needs to respond to `#to_str` (IIRC). Just trying to clarify. — RoUS, Jun 28 '23 at 17:51
Environment variables aren't Ruby-specific, they're a general OS feature, and it only supports strings. — Barmar, Jun 28 '23 at 17:52
See https://stackoverflow.com/questions/5564418/exporting-an-array-in-bash-script — Barmar, Jun 28 '23 at 17:52
Ahhh, I see what you mean, although I think it would be more clear if you had said, 'Bash array variables are a construct specific to bash, and not valid POSIX environment variables. This places a restriction on their [ex]portability.' Thank you for the speedy informative link! — RoUS, Jun 28 '23 at 18:08
I said "shell arrays", sorry if it wasn't clear. It's not specific to bash, it also applies to ksh, zsh, dash, even csh. — Barmar, Jun 28 '23 at 19:10
[This](https://pubs.opengroup.org/onlinepubs/9699919799/functions/setenv.html) is the POSIX specification for setting an environment variable. As you see, there is no provision for arrays. Some shells provide the facility to pass "function definitions" through the environment; this is done by - under the hood - turning the function into a string and having the child process recreate the function. You could employ a similar strategy for your case. — user1934428, Jun 29 '23 at 13:22

score 1 · Answer 1 · answered Jun 28 '23 at 18:14

As mentioned by @Barmar (and explained by his link to Exporting an array in bash script) this is not a deficiency in Ruby. Rather, it is a restriction in bash. POSIX-compliant envariables must be strings. The additional array and associative-array constructs (and presumably the numeric semantics as well) are bags-on-the-side specific to bash, and cannot be exported directly. There are numerous ways of attempting to retain the structure attributes through serialisation (see the link), but the basic answer appears to be: "If you want to do this, you're going to have to do it yourself. There's no built-in way of doing it."
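
As a rough illustration of the "strings only, serialise it yourself" point, the sketch below shows Ruby's `ENV` rejecting an Array and then round-trips a list through a single delimited string. The variable name `DEMO_LIST`, the helper names, and the `:` delimiter are all assumptions made up for this example, not anything the shell or Ruby prescribes.

```ruby
# DEMO_LIST is a hypothetical variable name used only for this sketch.

# Returns true if ENV accepts the value, false if the assignment raises TypeError.
def env_accepts?(value)
  ENV["DEMO_LIST"] = value
  true
rescue TypeError
  false
ensure
  ENV.delete("DEMO_LIST")
end

# Serialise an array into one string.
# The ":" delimiter is an arbitrary choice and must not occur inside any element.
def serialise_list(elements)
  elements.join(":")
end

# Restore the array from the delimited string.
def restore_list(string)
  string.split(":")
end

p env_accepts?("foo:bar")        # a plain String is fine
p env_accepts?(["foo", "bar"])   # an Array is rejected
p restore_list(serialise_list(["foo", "bar"]))
```

A delimiter-based encoding like this breaks as soon as an element contains the delimiter, which is one reason a real serialisation format may be a safer choice.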

RoUS

It is not a restriction by bash, but by the operating system. – user1934428 Jun 29 '23 at 13:22
As for how to do it, I would recommend formatting the arrays as something Ruby can trivially deserialize like JSON. Alternatively, since bash and Ruby both have `eval` it wouldn't be too hard to split them up across multiple variables e.g. `A0`, `A1`, etc. – Max Jun 29 '23 at 13:38
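
The two ideas in the comment above could look roughly like this on the Ruby side. This is only a sketch: the variable name `FOO_JSON`, the `A0`, `A1`, ... naming scheme, and both helper names are hypothetical, and producing the JSON string or the numbered variables is left to whatever exports them from the shell. The helpers take the environment as a parameter so they can be tried with a plain Hash; in a real script you would pass `ENV`.

```ruby
require "json"

# Parse a JSON array or object that was exported as a single string.
def structure_from_json(env, name)
  JSON.parse(env.fetch(name))
end

# Collect PREFIX0, PREFIX1, ... into an array ordered by the numeric suffix.
def array_from_numbered(env, prefix)
  pattern = /\A#{Regexp.escape(prefix)}(\d+)\z/
  pairs = env.filter_map do |key, value|
    match = pattern.match(key)
    [match[1].to_i, value] if match
  end
  pairs.sort_by(&:first).map(&:last)
end

# A Hash standing in for ENV, with made-up contents.
demo_env = {
  "FOO_JSON" => '["foo bar", "baz quux"]',
  "A1" => "baz quux",
  "A0" => "foo bar"
}

p structure_from_json(demo_env, "FOO_JSON")
p array_from_numbered(demo_env, "A")
```

JSON keeps element boundaries and supports nested hashes, while the numbered-variable scheme avoids any parsing at the cost of cluttering the environment.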

score 0 · Answer 2 · answered Jun 28 '23 at 17:48

I'd imagine you could run a shell command such as printenv or env from within the Ruby script and pipe its output to a subprocess that parses everything, or only what you need (you may need to use the 'more' flag with the env command, depending on what you're looking for and the OS).

Refs:

Operating with shell commands: How to call shell commands from Ruby
Piping subprocesses: Ruby pipes: How do I tie the output of two subprocesses together?
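
A minimal sketch of that idea, assuming an `env` command is available on the PATH: it runs `env` in a child process and parses the `NAME=value` lines into a Hash. The helper name and its `extra_vars` parameter are made up for this example. Note that the child only ever receives flat strings, so this yields the same kind of data as `ENV` and cannot recover an array that was never exported.

```ruby
require "open3"

# Run `env` in a child process and parse its NAME=value lines into a Hash.
# extra_vars is a hypothetical parameter for handing extra variables to the child.
# Values containing newlines are not handled by this line-based parsing.
def child_environment(extra_vars = {})
  output, status = Open3.capture2(extra_vars, "env")
  raise "env failed" unless status.success?

  pairs = output.each_line(chomp: true).filter_map do |line|
    name, separator, value = line.partition("=")
    [name, value] unless separator.empty?
  end
  pairs.to_h
end

vars = child_environment("DEMO_VAR" => "foo bar")
p vars["DEMO_VAR"]
p vars.values.all?(String)
```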

fumacc-e

So this is an undocumented (except by inference) restriction in Ruby? It appears to apply to Python and Perl, as well, but I haven't investigated in depth. I wonder why this is so, other than perhaps dealing with parsing to/from the #to_s return for `ENV`-specific hash and array implementations. It clearly isn't as simple or obvious as it seemed to me at first blush. – RoUS Jun 28 '23 at 17:55

Todd A. Jacobs · Answer 3 · 2023-06-29T06:28:27.087

TL;DR

This issue is largely shell-specific, and is caused by the fact that most shells require environment variables to be strings rather than arrays. With many shells, that makes shell arrays inaccessible via Ruby's ENV.

Exported arrays from Fish can be accessed through ENV as indexed arrays, but Fish currently has no explicit support for associative arrays. On the other hand, Bash supports both types of arrays but doesn't appear to export arrays to the environment at all, even though the syntax allows for it. To make matters worse, export FOO=(foo bar baz) won't return a non-zero exit status. As a result, you can't directly access Bash arrays through Ruby's ENV, although workarounds exist.

Skip to the end if you just want to see my preferred solution using ARGV, but I think the other approaches I discuss first have their own merits and potential utility value. In other words, if you really need access through Ruby's ENV rather than ARGV, you may need to pass the definition of the array into your environment rather than accessing the shell array directly. That's why I talk through those approaches first.

Indexed Arrays in Fish

In the Fish shell, all variables are intrinsically arrays anyway, so you can do something like:

set -gx FOO foo bar baz
ruby -e 'p ENV["FOO"]; p ENV["FOO"].split.last'

and get sensible results, but it's essentially up to you to split the strings, or to understand how such shells encode the values, in order to find your preferred index or to treat multi-word shellwords as separate array elements. Your question wasn't about the Fish shell, but I thought it provided a useful illustration of the fact that shell arrays are (by definition) handled in shell-specific ways, so you might need to take that into account if you must do this through environment variables.
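
To make the splitting problem concrete, here is a small sketch that works on literal strings standing in for what `ENV["FOO"]` might contain. The sample values and helper names are assumptions for illustration. The second helper only helps if whoever exported the variable shell-quoted each element first, which is a convention you would have to arrange yourself.

```ruby
require "shellwords"

# Split on whitespace: fine for single-word elements only.
def split_plain(value)
  value.split
end

# Split using shell quoting rules: keeps quoted multi-word elements together.
def split_quoted(value)
  Shellwords.split(value)
end

p split_plain("foo bar baz")                 # three single-word elements survive
p split_plain("foo bar baz quux")            # element boundaries are ambiguous here
p split_quoted('"foo bar" "baz quux"')       # quoting preserves the two elements
```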

For Bash and Zsh, there are definitely better approaches than relying on the direct export of a shell array to the environment. In most cases you simply shouldn't rely on Ruby's ENV to access shell arrays, since many Bourne-based shells can't export such variables directly.

Arrays from Ruby in Bash-Like Shells

Bash and Zsh provide declare -a, declare -A, and declare -p which can help up to a point, but still require you to work around the fact that shell arrays aren't usually exposed as normal strings in the shell's environment even when exported, and will therefore be inaccessible to Ruby's ENV.

For example, you could do something like the following in Bash/Zsh to essentially call declare to re-assign the definition of the array stored in FOO to another environment variable such as RUBY_FOO. This definition, unlike the array itself, can be exported to sub-shells and Ruby's ENV as an indirect way to leverage shell builtins and expansions to access your array elements from inside Ruby.

FOO=(foo bar baz)
export RUBY_FOO=$(declare -p FOO)
ruby -e 'puts %x(#{ENV["RUBY_FOO"]}; printf "%s\n" ${FOO[@]})'

This will correctly return "foo\nbar\nbaz\n", provided that the /bin/sh that Ruby's %x invokes is Bash-compatible enough to understand declare and array expansions. For more complex word-splitting, you can look into the Shellwords module or do some of your own parsing. As an example, consider this array that contains two words in every element, potentially subjecting the words to unexpected word splitting or expansions unless properly quoted.

FOO=("foo bar" "baz quux")
export RUBY_FOO=$(declare -p FOO)
ruby -e 'p %x(#{ENV["RUBY_FOO"]}; printf "%s\n" "${FOO[@]}").split("\n")'

This time, you'll correctly get back ["foo bar", "baz quux"], which is pretty good considering the limitations of the approach. You could also manually parse the output of declare -p to convert things into a Ruby collection, but that seems like more trouble than it's worth when there's a much easier way!
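
If you want the same idea as a reusable helper, one possible sketch is to invoke bash explicitly (rather than whatever shell `%x` picks) and have it print the elements NUL-delimited, so that elements containing spaces or newlines survive. The helper name is made up, `RUBY_FOO` is assumed to hold `declare -p` output as in the examples above, and bash is assumed to be on the PATH. Because the declaration is evaluated by bash, this should only be used with values you trust.

```ruby
require "open3"

# Re-create a bash array from its `declare -p` text and return its elements.
# declaration: the string produced by `declare -p NAME` in bash
# name:        the array's variable name
def bash_array_elements(declaration, name)
  unless name.match?(/\A[A-Za-z_][A-Za-z0-9_]*\z/)
    raise ArgumentError, "unexpected variable name: #{name.inspect}"
  end

  # bash evaluates the declaration, then prints each element followed by a NUL byte.
  script = %(#{declaration}; printf '%s\\0' "${#{name}[@]}")
  output, status = Open3.capture2("bash", "-c", script)
  raise "bash exited with status #{status.exitstatus}" unless status.success?

  # Trailing empty elements are dropped by this simple split.
  output.split("\0")
end

# Only runs when the variable has actually been exported by the calling shell.
p bash_array_elements(ENV["RUBY_FOO"], "FOO") if ENV.key?("RUBY_FOO")
```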

Use Ruby's ARGV to Pass Array Elements as Arguments

Indexed Arrays

A much easier approach, regardless of your choice of shell, is to simply pass your indexed array as shell-quoted arguments to your script. For example:

FOO=("foo bar" "baz quux")
ruby -e 'ARGV.each { p _1 }' "${FOO[@]}"

This will print each of your array elements with minimal fuss, and no need for indirection, parsing, or anything other than the quoting needed to properly form the shell's array values in the first place.

Associative Arrays

Note that a very similar approach also works for shells that support associative arrays. For example, let's unset FOO and explicitly recast it as an associative array. Then we'll pass the keys and values into ARGV using shell expansions.

unset FOO
declare -A FOO=("foo bar" 200 "baz quux" 404)
ruby -e 'size = ARGV.count / 2
         p ARGV[...size].zip(ARGV[-(size)..]).to_h' \
    "${!FOO[@]}" "${FOO[@]}"

This returns a pretty Hash like {"foo bar"=>"200", "baz quux"=>"404"}, but that's really just an implementation detail chosen for this example. Passing the keys and values of the associative array is easy; it's the effort of splitting, joining, and formatting them in Ruby that makes it look harder than it is.

All we've really had to do here is pass the associative keys from the shell's "${!FOO[@]}" expansion plus the stored values from the expansion of "${FOO[@]}". The rest is just zipping up the matching key/value pairs into sub-arrays, and then converting the result to a Hash!
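
The zipping step can be pulled out into a small helper if you'd rather not inline it. This is a sketch under the same assumption as the one-liner above, namely that the arguments arrive as all keys followed by all values in matching order; the helper name is hypothetical.

```ruby
# Build a Hash from an argument list laid out as all keys followed by all values.
def hash_from_keys_then_values(args)
  raise ArgumentError, "expected an even number of arguments" if args.size.odd?

  half = args.size / 2
  args.first(half).zip(args.last(half)).to_h
end

# Stand-in for ARGV as it would look after the two shell expansions.
demo_args = ["foo bar", "baz quux", "200", "404"]
p hash_from_keys_then_values(demo_args)
```

In a real script you would pass `ARGV` instead of `demo_args`. All values arrive as strings, so convert them with `Integer(...)` or similar if you need numbers.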

The requirement that environment variables be plain strings is *not* a shell restriction, it's how they're implemented in the OS. Even in `fish`, it doesn't actually store arrays in env variables, it converts the exported arrays to plain strings and stores those as env variables. For example, `set -gx FOO "foo bar" "baz quux"` actually creates an env variable `FOO` with the value "foo bar baz quux" (with no distinction between spaces *between* elements vs spaces *within* elements). — Gordon Davisson, Jun 29 '23 at 08:08
@GordonDavisson That's hair-splitting, since Fish knows how the string is split into elements, and individual elements can be queried and subshells can inherit the exported values (which Bash and Zsh do not). If you want to improve the semantics of my statement please suggest alternative wording, but arguing that Fish can't export arrays *pragmatically* is demonstrably not true, regardless of the underlying storage and retrieval mechanism. *That* was the point of mentioning it; bike-shedding neither improves answer semantics nor addresses the OP's question about how to get at such data. — Todd A. Jacobs, Jun 29 '23 at 15:43
But fish doesn't actually export arrays, and it inherently can't because environment variables *are plain strings*. What fish does is convert arrays to a stringified version and export that, but the exported version is not an array anymore. If you access it from ruby, it'll be a string (which you could split into an array, but until/unless you split it, it's just a string). Even fish itself will see the env variable as a plain string. Subshells will see the array, because they get copies of non-exported data, but try `set -gx array one two three; fish -c 'count $array'`, and it'll print "1". — Gordon Davisson, Jun 29 '23 at 19:03