
I have a project that needs to render a template. I want to ensure in advance that all variables in the template are not empty.

With a script like the following, I was able to extract the variables from a template in which each line contains only one variable:

  • template file: tmpl_single.tmpl
{
  "var1": "${VARIABLE_1}",
  "var2": "${VARIABLE_2}",
  "var3": "${VARIABLE_3}"
}
  • shell script file: generate.sh
#!/bin/bash
DIR_BASE="$(cd "$(dirname "$0")" && pwd)"
render_2_file() {
  template_file=$1
  out_file=$2
  set +u
  for env in $(sed -n 's/^[^#].*${\(.*\)}.*/\1/p' $template_file); do
    # debug
    echo "$env : $(eval echo \$${env})"
    if [ -z "$(eval echo \$${env})" ]; then
      echo "environment variable '${env}' not set"
      missing=true
    fi
  done
  if [ "${missing}" ]; then
    echo 'Please check the above variable'
    exit 1
  fi
  eval "cat << EOF
$(cat ${template_file})
EOF" >"$out_file"
  set -u
}
main(){
    # debug generate 
    # VARIABLE_1=var1_val
    # VARIABLE_2=var1_val
    # VARIABLE_3=var1_val

    # single var in one line 
    TMPL_PATH=$DIR_BASE/tmpl_single.tmpl
    OUT_FILE=$DIR_BASE/tmpl_single.json
    # multi var in one line 
    # TMPL_PATH=$DIR_BASE/tmpl_multi.tmpl
    # OUT_FILE=$DIR_BASE/tmpl_multi.json
    echo "check path = $OUT_FILE"
    if [ ! -f "$OUT_FILE" ]; then
        echo "not found, generating"
        render_2_file "$TMPL_PATH" "$OUT_FILE"
        if [ $? = 0 ]; then
            echo "generate successfully"
        fi
    else
        echo "out file existed, no need to generate"
    fi
}
main "$@"

The output is shown below. With one variable per line, the script detects every variable; because none of them is set here, it reports an error and does not generate the target file.

stdout:
not found, generating
VARIABLE_1 : 
environment variable 'VARIABLE_1' not set
VARIABLE_2 : 
environment variable 'VARIABLE_2' not set
VARIABLE_3 : 
environment variable 'VARIABLE_3' not set
Please check the above variable

However, if a single line in the template file contains multiple variables, only the last variable in the line can be extracted.

  • template file: tmpl_multi.tmpl
{
  "var12": "test1_${VARIABLE_1}:test2_${VARIABLE_2}",
  "var3": "test3_${VARIABLE_3}"
}
  • stdout:
not found, generating
VARIABLE_2 : 
environment variable 'VARIABLE_2' not set
VARIABLE_3 : 
environment variable 'VARIABLE_3' not set
Please check the above variable

It can be seen from the above output that variable VARIABLE_1 is not extracted.

Please tell me how to extract multiple variables wrapped in ${} from a single line. Looking forward to your reply.

moluzhui
  • `grep -o` can be used to print each match on its own line (instead of sed) – knittl Sep 02 '23 at 07:06
  • `eval` can be quite dangerous. Can you trust all your input files? – knittl Sep 02 '23 at 07:06
  • Do you also need to handle nested variables (with fallback or alternative values) or other types of expansion? e.g. `${var1:-default}`, `${var2:+-o "$var2"}`, `${var3:-$var4}`, `${path%/*}`, and similar? Or is the only supported format in your template file `${var}`? – knittl Sep 02 '23 at 07:08
  • @knittl Answers to your three questions: 1. My understanding is that `grep -o` can only determine lines, but cannot extract variables. 2. The input file is defined by myself and can be trusted. 3. There should be no other types. The above code is a sample code. It actually needs to parse other files, then get the actual value of the variable from it, and then render the template. During this period, I need to ensure that the variable is not empty. – moluzhui Sep 02 '23 at 07:18
  • `echo '1a2b' | grep -o '[0-9]'` outputs 1 and 2. You wrote "empty": I assume this includes set variables but with an empty value (so not only unset values?) – knittl Sep 02 '23 at 08:51
  • Don't do this with sed. It's fairly trivial in awk; in perl it is `perl -pe 's/\${(\w+)}/$ENV{$1}/g;' input`, and I'm pretty sure there is a simple bash or printf hack to do this easily and robustly. – William Pursell Sep 02 '23 at 11:56
  • https://stackoverflow.com/questions/10683349/forcing-bash-to-expand-variables-in-a-string-loaded-from-a-file – William Pursell Sep 02 '23 at 12:01
  • It would be much safer to use `jq` -- which natively understands JSON and knows how to quote strings to be `eval`-safe -- instead of `sed`. And we have existing answered questions here that show how to use jq to expand variables in JSON while ensuring that the results are escaped back to valid JSON syntax. – Charles Duffy Sep 02 '23 at 19:53

3 Answers


I would be cautious of using eval. Just imagine checking an input file with ${PATH+$(rm -rf /*)} or a variable that expands to something fishy.
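
To make the risk concrete, here is a minimal sketch with a deliberately harmless payload. The template string is hypothetical (it is not taken from the question), and it is rendered with the same `eval "cat << EOF ..."` technique that `render_2_file` uses:

```shell
#!/bin/bash
# Make sure the variable is unset so the fallback branch is taken.
unset VARIABLE_1

# Hypothetical template line; the command substitution is harmless here,
# but it could be any command.
template='{ "var1": "${VARIABLE_1:-$(echo INJECTED)}" }'

# Same rendering technique as render_2_file in the question.
rendered=$(eval "cat << EOF
${template}
EOF")

echo "$rendered"   # prints: { "var1": "INJECTED" }
```

The `echo` ran during rendering, which shows that anything inside `$(...)` in a template is executed by the shell.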

Nevertheless, if you want to go that route, grep is perfect for extracting multiple matches from lines:

grep -io '\${[a-z0-9_]*}' | grep -o '[^${}]*' | sort -u
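
As a quick check, here is that pipeline applied to the contents of `tmpl_multi.tmpl` from the question. The template is inlined in a variable only to keep the sketch self-contained; the variable names `template` and `extracted` are my own.

```shell
#!/bin/bash
# Contents of tmpl_multi.tmpl from the question, inlined for the demo.
template='{
  "var12": "test1_${VARIABLE_1}:test2_${VARIABLE_2}",
  "var3": "test3_${VARIABLE_3}"
}'

# First grep prints every ${NAME} match on its own line,
# second grep strips the "${" and "}" wrapper, sort -u removes duplicates.
extracted=$(printf '%s\n' "$template" |
  grep -io '\${[a-z0-9_]*}' | grep -o '[^${}]*' | sort -u)

printf '%s\n' "$extracted"
```

This prints `VARIABLE_1`, `VARIABLE_2` and `VARIABLE_3`, one per line, so both variables on the first template line are found.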

But since your input already seems to be valid JSON and the "rules" of what is interpreted as a variable are limited, may I interest you in a bit of jq instead?

empty="$(jq -re 'map(scan("\\${(\\w+)}")[] | select(env[.]|length==0)) | unique[]' vars.json)" \
  && echo "The following vars are empty: $empty";

or

if empty="$(jq -re 'map(scan("\\${(\\w+)}")[] | select(env[.]|length==0)) | unique[]' vars.json)"; then
  echo "The following vars are empty: $empty";
fi

or

empty="$(jq -re 'map(scan("\\${(\\w+)}")[] | select(env[.]|length==0)) | unique[]' vars.json)";
if [ "$empty" ]; then
  echo "The following vars are empty: $empty";
fi
knittl
  • There's a slight problem: if the variables are defined but not exported, then `jq` won't be able to access them. – Fravadona Sep 02 '23 at 16:00
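
The difference between a set and an exported variable can be seen with a small sketch. It uses a child `bash -c` process as a stand-in for `jq`, since any child process only receives exported variables; the values and the name `child_view` are made up for illustration.

```shell
#!/bin/bash
# Start from a clean state so inherited exports do not skew the demo.
unset VARIABLE_1 VARIABLE_2

VARIABLE_1=shell_only          # set in this shell, but not exported
export VARIABLE_2=exported     # set and exported

# The child process sees only the exported variable.
child_view=$(bash -c 'printf "%s|%s" "${VARIABLE_1-}" "${VARIABLE_2-}"')

echo "$child_view"   # prints: |exported
```

Running `export VARIABLE_1` (or `set -a` before the assignments) makes the variable visible to `jq` as well.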

You can first use grep -oP to extract the varnames and save them into a bash array.
Then you check that they all exist in the environment.
Then you export them and call a program like envsubst to process the template.

note: You'll still have to make sure that the added content doesn't break your JSON.

#!/bin/bash

readarray -t varnames < <(
    LANG=C grep -oP '(?<=\$\{)[[:alpha:]_][[:alnum:]_]*(?=})' tmpl_single.tmpl |
    sort -u
)

for ref in "${varnames[@]}"
do
    [[ ${!ref:+1} ]] || {
        printf 'environment variable %s is not defined\n' "$ref" >&2
        exit 1
    }
done

(
    export "${varnames[@]}"
    envsubst < tmpl_single.tmpl
)
Fravadona

Using Bash + TXR:

$ ./check.sh tmpl_single.tmpl
variable VARIABLE_1 doesn't exist
variable VARIABLE_3 doesn't exist

Code in check.sh:

#!/bin/bash

VARIABLE_2=abc   # make VARIABLE_2 exist, for testing

eval $(txr -B -c \
'@(collect)
@(coll :vars (VAR))${@{VAR /[_a-zA-Z0-9]+/}}@(end)
@(end)
@(flatten VAR)' "$@")

for V in "${VAR[@]}"; do
  if [ -z "${!V+x}" ]; then
    printf "variable %s doesn't exist\n" "$V"
  fi
done

`txr -B` outputs variable bindings in shell assignment format, with proper escaping. List variables become array assignments like VAR[0]=VARIABLE_1, which gets us the variables in a Bash array.

An improvement would be to remove duplicates, since variables could occur in a template more than once.
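
One possible way to do that in plain Bash is sketched below. It assumes Bash 4 or later for the associative array; the `VAR` contents are a made-up stand-in for what `txr -B` would produce, and `seen` and `unique` are names chosen for this example.

```shell
#!/bin/bash
# Hypothetical stand-in for the VAR array filled in by the eval above.
VAR=(VARIABLE_1 VARIABLE_2 VARIABLE_1 VARIABLE_3 VARIABLE_2)

declare -A seen=()   # names already encountered (requires Bash 4+)
unique=()            # de-duplicated names, in first-seen order

for V in "${VAR[@]}"; do
  if [[ -z ${seen[$V]+x} ]]; then
    seen[$V]=1
    unique+=("$V")
  fi
done

printf '%s\n' "${unique[@]}"
```

The existence check can then loop over `"${unique[@]}"` instead of `"${VAR[@]}"`, so each missing variable is reported once.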

Kaz