
As I am teaching myself Bash programming, I came across an interesting use case, where I want to take a list of variables that exist in the environment, and put them into an array. Then, I want to output a list of the variable names and their values, and store that output in an array, one entry per variable.

I'm only about 2 weeks into Bash shell scripting in any "real" way, and I am educating myself on arrays. A common feature in other programming languages is the ability to "zip" two arrays, e.g. as is done in Python. Another common feature is indirection, e.g. via pointers. This is largely academic, to teach myself through a somewhat challenging example, but I think it has widespread use, if for no other reason than debugging, keeping track of overall system state, etc.

What I want is for the following input... :

VAR_ONE="LIGHT RED"
VAR_TWO="DARK GREEN"
VAR_THREE="BLUE"
VARIABLE_ARRAY=(VAR_ONE VAR_TWO VAR_THREE)

... to be converted into the following output (as an array, one element per line):

VAR_ONE: LIGHT RED
VAR_TWO: DARK GREEN
VAR_THREE: BLUE

Constraints:

  • Assume that I do not have control of all of the variables, so I cannot just sidestep the problem e.g. by using an associative array from the get-go. (i.e. please do not recommend avoiding the need for indirect reference lookups altogether by never having a discrete variable named "VAR_ONE"). But a solution that stores the result in an associative array is fine.
  • Assume that variable names will never contain spaces, but their values might.
  • The final output should not contain separate elements just because the input variables had values containing spaces.

What I've read about so far:

  • I've read some StackOverflow posts like this one, that deal with using indirect references to arrays themselves (e.g. if you have three arrays and want to choose which one to pull from based on an "array choice" variable): How to iterate over an array using indirect reference?
  • I've also found one single post that deals with "zipping" arrays in Bash in the manner I'm talking about, where you pair-up e.g. the 1st element from array1 and array2, then pair up the 2nd elements, etc.: Iterate over two arrays simultaneously in bash
  • ...but I haven't found anything that quite discusses this unique use-case...

QUESTION:

How should I make an array containing a list of variable names and their values (colon-separated), given an array containing a list of variable names only? I'm not "failing to come up with any way to do it", but I want to find the "preferred" way to do this in Bash, considering performance, security, and being concise/understandable.

EDIT: I'll post what I've come up with thus far as an answer to this post... but not mark it as answered, since I want to also hear some unbiased recommendations...

Sean
  • if you don't mind using the variable names as the array index, consider using an associative array (`declare -A arrayname`); as for the rest of your question(s) ... I'd suggest you focus on a single issue that you're having and then either a) update the question, focusing on that one issue or b) close/delete this question and open a new more-focused question; this question will likely get flagged for closure because a) there's no focus on one particular issue and b) you appear to be asking for opinions – markp-fuso Apr 09 '21 at 17:31
  • I don't understand why you exclude associative arrays because you do not "have control of all of the variables" – that shouldn't matter? And I agree with @markp-fuso, this seems to be many questions at once. – Benjamin W. Apr 09 '21 at 17:38
  • I'll admit I stopped reading the question half way thru (no focus, rambling) so missed OP's comment about associative arrays; the missive (?) about using associative arrays due to spaces in the values doesn't make sense ... the associative array index and values may contain white space ... ? not sure how OP plans to 'zip' a list of variables if all the names are not known ... ?? – markp-fuso Apr 09 '21 at 17:44
  • @markp-fuso : Edited the question to put my own ideas as an answer instead of in the main post. Regarding a single issue, it is: make an array containing a list of variable names and their values (colon-separated), given an array containing a list of variable names only. I'll admit that I'm not "failing to come up with any way to do it" but I want to find the "preferred" way to do this in Bash, considering performance, security, and being concise/understandable. I'll note this in the original post. – Sean Apr 10 '21 at 01:01
  • @BenjaminW. If you have an idea using associative arrays, please do share. What I meant is that I've read some posts saying to "avoid setting yourself up for the need to use indirect references altogether" by storing name/value pairs (e.g. [VAR_1] = LIGHT RED) in an associative array "from the start". Assume that I'm using predefined system environment variables. But if you have a way to collect the values from those variables and put them *into* an associative array, that would be a good solution. Would that make the final "printout" step a little more verbose, though? – Sean Apr 10 '21 at 01:04

2 Answers


OP starts with:

VAR_ONE="LIGHT RED"
VAR_TWO="DARK GREEN"
VAR_THREE="BLUE"
VARIABLE_ARRAY=(VAR_ONE VAR_TWO VAR_THREE)

OP has provided an answer with 4 sets of code:

# first 3 sets of code generate:

$ typeset -p outputValues
declare -a outputValues=([0]="VAR_ONE: LIGHT RED" [1]="VAR_TWO: DARK GREEN" [2]="VAR_THREE: BLUE")

# the 4th set of code generates the following where the data values are truncated at the first space:

$ typeset -p outputValues
declare -a outputValues=([0]="VAR_ONE: LIGHT" [1]="VAR_TWO: DARK" [2]="VAR_THREE: BLUE")

NOTES:

  • I'm assuming the output from the 4th set of code is wrong so will be ignoring this one
  • OP's code samples touch on a couple of ideas I'm going to make use of (below): namerefs (declare -n <variable_name>) and indirect variable references (${!<variable_name>})


For readability (and maintainability by others) I'd probably avoid the various eval and expansion ideas and instead opt for using bash namerefs (declare -n); a quick example:

$ x=5
$ echo "${x}"
5

$ y=x
$ echo "${y}"
x

$ declare -n y="x"          # nameref => y is now another name for x
$ echo "${y}"
5

Pulling this into the original issue we get:

unset outputValues
declare -a outputValues                           # optional; declare 'normal' array

for var_name in "${VARIABLE_ARRAY[@]}"
do
    declare -n data_value="${var_name}"
    outputValues+=("${var_name}: ${data_value}")
done

Which gives us:

$ typeset -p outputValues
declare -a outputValues=([0]="VAR_ONE: LIGHT RED" [1]="VAR_TWO: DARK GREEN" [2]="VAR_THREE: BLUE")

While this generates the same results as OP's first 3 sets of code, there is (for me) the nagging question of how this new array is going to be used.

If the sole objective is to print this pre-formatted data to stdout ... ok, though why bother with a new array when the same can be done with the current array and namerefs?
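
As a minimal sketch of that point, the names in VARIABLE_ARRAY can be printed directly through a nameref without building a second array. The function name print_vars and the nameref name value_ref are hypothetical, chosen only for this illustration, and the sketch assumes bash 4.3 or newer:

```shell
#!/usr/bin/env bash
VAR_ONE="LIGHT RED"
VAR_TWO="DARK GREEN"
VAR_THREE="BLUE"
VARIABLE_ARRAY=(VAR_ONE VAR_TWO VAR_THREE)

# print_vars (hypothetical helper): print "name: value" for each variable
# name passed as an argument, without storing the result anywhere.
print_vars() {
    local name
    for name in "$@"; do
        local -n value_ref="$name"      # re-point the nameref at the variable called $name
        printf '%s: %s\n' "$name" "$value_ref"
    done
}

print_vars "${VARIABLE_ARRAY[@]}"
```

Because value_ref is local to the function, nothing needs to be cleaned up afterwards. The helper would misbehave if a caller passed the names "name" or "value_ref" themselves, so treat those two as reserved.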

If the objective is to access this array as sets of variable name/value pairs for processing purposes, then the current structure is going to be hard(er) to work with, eg, each array 'value' will need to be parsed/split based on the delimiter :<space> in order to access the actual variable names and values.
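
To illustrate the extra parsing step, here is a rough sketch of splitting each "name: value" element back apart with parameter expansion. It relies on the question's constraint that variable names never contain spaces; the array names names and values are made up for this example:

```shell
#!/usr/bin/env bash
outputValues=("VAR_ONE: LIGHT RED" "VAR_TWO: DARK GREEN" "VAR_THREE: BLUE")

names=()
values=()
for entry in "${outputValues[@]}"; do
    name=${entry%%: *}      # drop everything from the first ": " onwards
    value=${entry#*: }      # drop everything up to and including the first ": "
    names+=("$name")
    values+=("$value")
    printf 'name=%s value=%s\n' "$name" "$value"
done
```

This works, but every consumer of the array has to repeat the split and agree on the delimiter.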

In this scenario I'd opt for using an associative array, eg:

unset outputValues
declare -A outputValues                           # required; declare associative array

for var_name in "${VARIABLE_ARRAY[@]}"
do
    declare -n data_value="${var_name}"
    outputValues[${var_name}]="${data_value}"
done

Which gives us:

$ typeset -p outputValues
declare -A outputValues=([VAR_ONE]="LIGHT RED" [VAR_THREE]="BLUE" [VAR_TWO]="DARK GREEN" )

NOTES:

  • again, why bother with a new array when the same can be done with the current array and namerefs?
  • if the variable $data_value is to be re-used in follow-on code as a 'normal' variable it will be necessary to remove the nameref attribute (unset -n data_value)
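
The difference between unset and unset -n is easy to trip over, so here is a small sketch of it. The variable names target, ref and state_after_plain_unset are placeholders for this illustration only:

```shell
#!/usr/bin/env bash
target="hello"
declare -n ref=target

unset ref                                     # no -n: this unsets 'target', the variable ref points at
state_after_plain_unset=${target-"(unset)"}   # expands to "(unset)" because target no longer exists
echo "target after 'unset ref': ${state_after_plain_unset}"

target="hello again"
unset -n ref                                  # -n: removes the nameref itself and leaves target alone
ref="plain value"                             # ref is now an ordinary variable
echo "target: ${target}"
echo "ref:    ${ref}"
```

So after the loop, unset -n data_value is the form to use; a plain unset data_value would instead remove the last variable the nameref pointed at.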

With an associative array (index=variable name / array element=variable value) it becomes easier to reference the variable name/value pairs, eg:

$ myvar=VAR_ONE
$ echo "${myvar}: ${outputValues[${myvar}]}"
VAR_ONE: LIGHT RED

$ for var_name in "${!outputValues[@]}"; do echo "${var_name}: ${outputValues[${var_name}]}"; done
VAR_ONE: LIGHT RED
VAR_THREE: BLUE
VAR_TWO: DARK GREEN
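
Note that iterating over "${!outputValues[@]}" does not return the keys in insertion order, as the output above shows. If the original order matters, one option (sketched here, with the array name ordered being an assumption of this example) is to drive the loop from VARIABLE_ARRAY and only look the values up in the associative array:

```shell
#!/usr/bin/env bash
VARIABLE_ARRAY=(VAR_ONE VAR_TWO VAR_THREE)
declare -A outputValues=([VAR_ONE]="LIGHT RED" [VAR_TWO]="DARK GREEN" [VAR_THREE]="BLUE")

ordered=()
for var_name in "${VARIABLE_ARRAY[@]}"; do
    # The indexed array supplies the order, the associative array supplies the value.
    ordered+=("${var_name}: ${outputValues[$var_name]}")
done

printf '%s\n' "${ordered[@]}"
```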


In older versions of bash (before namerefs were available; they arrived in bash 4.3), and still in newer versions, there's the option of using indirect variable references:

$ x=5
$ echo "${x}"
5

$ unset -n y        # make sure 'y' has not been previously defined as a nameref
$ y=x
$ echo "${y}"
x

$ echo "${!y}"
5

Pulling this into the associative array approach:

unset -n var_name                                 # make sure var_name not previously defined as a nameref
unset outputValues
declare -A outputValues                           # required; declare associative array

for var_name in "${VARIABLE_ARRAY[@]}"
do
    outputValues[${var_name}]="${!var_name}"
done

Which gives us:

$ typeset -p outputValues
declare -A outputValues=([VAR_ONE]="LIGHT RED" [VAR_THREE]="BLUE" [VAR_TWO]="DARK GREEN" )

NOTE: While this requires less coding in the for loop, if you forget to unset -n the variable (var_name in this case) then you'll end up with the wrong results if var_name was previously defined as a nameref; perhaps a minor issue but it requires the coder to know of, and code for, this particular issue ... a bit too esoteric (for my taste) so I prefer to stick with namerefs ... ymmv ...
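
A short sketch of why the nameref attribute matters here: ${!name} means "indirect expansion" for an ordinary variable, but for a nameref it expands to the name of the referenced variable instead. The names x, y, r, via_indirection and via_nameref are placeholders for this example:

```shell
#!/usr/bin/env bash
x=5

y=x                       # ordinary variable holding the name "x"
via_indirection=${!y}     # indirect expansion: the value of x
echo "ordinary variable: ${via_indirection}"

declare -n r=x            # nameref to x
via_nameref=${!r}         # for a nameref, ${!r} is the NAME of the target
echo "nameref:           ${via_nameref}"

unset -n r                # remove the nameref again
```

The same syntax therefore yields a value in one case and a variable name in the other, which is how a leftover nameref can silently produce the wrong results.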

markp-fuso
  • Thanks @markp-fuso, I particularly like using the original array when you're printing-out values in your associative array. Otherwise, the ideas I was coming up with made that part kind of verbose. I don't fully follow "again, why bother with a new array when the same can be done with the current array and nameref's?" My idea for that was then I can do whatever I want with the output - print it, format it, use it for later assignments, etc. Although I agree that the colon symbols in my version makes that harder than your associative array method! – Sean Apr 10 '21 at 20:44
  • And yes, in the 4th set of code, I guess I had set `IFS=""` earlier on my machine. Running it from a fresh shell, I agree with your findings, it is not fully-correct as I listed the code in this post. I'll update it in a bit. – Sean Apr 10 '21 at 20:50

I've come up with a handful of possible solutions in the last couple of days, each with its own pros and cons. I won't mark this as the answer for a while though, since I'm interested in hearing unbiased recommendations.


My brainstorming solutions thus far:

OPTION #1 - FOR-LOOP:

alias PrintCommandValues='unset outputValues
for var in "${VARIABLE_ARRAY[@]}"
do outputValues+=("${var}: ${!var}")
done; printf "%s\n\n" "${outputValues[@]}"'
PrintCommandValues

Pros: Traditional, easy to understand

Cons: A little verbose. I'm not sure about Bash, but I've been doing a lot of Mathematica programming (imperative-style), where such loops are notably slower. Anybody know if that's true for Bash?
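
As a side note on Option #1, the same loop can be written as a shell function rather than an alias; aliases are not expanded in non-interactive shells unless expand_aliases is set, so a function is usually the more portable wrapper. This is only a sketch, and the function name print_command_values is made up for the example:

```shell
#!/usr/bin/env bash
VAR_ONE="LIGHT RED"
VAR_TWO="DARK GREEN"
VAR_THREE="BLUE"
VARIABLE_ARRAY=(VAR_ONE VAR_TWO VAR_THREE)

# Function version of the Option #1 alias: fills the global array
# outputValues and prints each element followed by a blank line.
print_command_values() {
    local var
    outputValues=()
    for var in "${VARIABLE_ARRAY[@]}"; do
        outputValues+=("${var}: ${!var}")     # indirect expansion, as in the alias
    done
    printf '%s\n\n' "${outputValues[@]}"
}

print_command_values
```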

OPTION #2 - EVAL:

i=0; outputValues=("${VARIABLE_ARRAY[@]}")
eval declare "${VARIABLE_ARRAY[@]/#/outputValues[i++]+=:\\ $}"
printf "%s\n\n" "${outputValues[@]}"

Pros: Shorter than the for-loop, and still easy to understand.

Cons: I'm no expert, but I've read a lot of warnings to avoid eval whenever possible, due to security issues. Probably not something I'll concern myself a ton over when I'm mostly writing scripts for "handy utility purposes" for my personal machine only, but...
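
If the list of names could ever come from somewhere untrusted, one common precaution before any eval, declare or indirect expansion is to check that each entry looks like a plain identifier. The sketch below is an assumption about how that could be done, not part of the options above; is_valid_name, candidates, accepted and rejected are hypothetical names, and the character ranges assume an ASCII-like locale:

```shell
#!/usr/bin/env bash
# is_valid_name (hypothetical helper): succeed only for strings that look
# like ordinary shell identifiers (letters, digits, underscores; no leading digit).
is_valid_name() {
    local re='^[A-Za-z_][A-Za-z0-9_]*$'
    [[ $1 =~ $re ]]
}

candidates=(VAR_ONE 'VAR_ONE; echo injected' 'arr[$(date)]')
accepted=()
rejected=()
for candidate in "${candidates[@]}"; do
    if is_valid_name "$candidate"; then
        accepted+=("$candidate")
    else
        rejected+=("$candidate")
    fi
done

printf 'accepted: %s\n' "${accepted[@]}"
printf 'rejected: %s\n' "${rejected[@]}"
```

Only the accepted names would then be passed on to whichever lookup method is used.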

OPTION #3 - QUOTED DECLARE WITH PARENTHESES:

i=0; declare -a outputValues="(${VARIABLE_ARRAY[@]/%/'\:\ "${!VARIABLE_ARRAY[i++]}"'})"
printf "%s\n\n" "${outputValues[@]}"

Pros: Super-concise. I just plain stumbled onto this syntax -- I haven't found it mentioned anywhere on the web. Apparently, using declare in Bash (I use version 4.4.20(1)), if (and ONLY if) you place array-style (...) brackets after the equals-sign, and quote it, you get one more "round" of expansion/dereferencing, similar to eval. I happened to be toying with this post, and found the part about the "extra expansion" by accident.

For example, compare these two tests:

varName=varOne; varOne=something
declare test1=\$$varName
declare -a test2="(\$$varName)"
declare -p test1 test2

Output:

declare -- test1="\$varOne"
declare -a test2=([0]="something")

Pretty neat, I think...

Anyways, the cons for this method are... I've never seen it documented officially or unofficially anywhere, so... portability...?

Alternative for this option:

i=0; declare -a LABELED_VARIABLE_ARRAY="(${VARIABLE_ARRAY[@]/%/'\:\ \$"${VARIABLE_ARRAY[i++]}"'})"
declare -a outputValues=("${LABELED_VARIABLE_ARRAY[@]@P}")
printf "%s\n\n" "${outputValues[@]}"

JUST FOR FUN - BRACE EXPANSION:

unset outputValues; OLDIFS=$IFS; IFS=; i=0; j=0
declare -n nameCursor=outputValues[i++]; declare -n valueCursor=outputValues[j++]
declare {nameCursor+=,valueCursor+=": "$}{VAR_ONE,VAR_TWO,VAR_THREE}
printf "%s\n\n" "${outputValues[@]}"
IFS=$OLDIFS

Pros: ??? Maybe speed?

Cons: Pretty verbose, not very easy to understand


Anyways, those are all of my methods... Are any of them reasonable, or would you do something different altogether?

Sean