
I am working on a Bash script that needs to ingest some data from a JSON file (extracted with jq) and do some simple work with it. While doing so, I noticed that elements containing whitespace are split apart, even though each one is enclosed in double quotes.

Here's an example of the JSON I will be processing:

{
  "start": 1689652086,
  "finish": 1679652100,
  "contestants": [
    {
      "name": "Joan-Juan Frank",
      "comment": "I reek of havoc."
    },
    {
      "name": "Kimi-Kinder Karten",
      "output": "I love chocolate."
    },
    {
      "name": "Peter-Parker Plays Piano",
      "output": "And I am tired of it"
    }
  ]
}

I composed this command to extract a formatted string from the JSON. It selects the value of the name property of each element in the contestants array:

cat json | jq '.contestants[].name' | tr '\n' ' '

This produces the following output, which I expected to work as an array declaration:

"Joan-Juan Frank" "Kimi-Kinder Karten" "Peter-Parker Plays Piano" 

I tried each of these two ways of creating the array (neither of them worked):

contestants=$(cat json | jq '.contestants[].name' | tr '\n' ' ')
array=($contestants)

or

array=($(cat json | jq '.contestants[].name' | tr '\n' ' '))

Lastly, I used this for loop to output each element:

for i in "${array[@]}"; do
    echo "Working on: $i"
done

The output of the script is:

Working on: "Joan-Juan
Working on: Frank"
Working on: "Kimi-Kinder
Working on: Karten"
Working on: "Peter-Parker
Working on: Plays
Working on: Piano"

I also went back and found that the issue seems to occur at the array declaration. When I echoed the variable before declaring the array and then echoed the elements of the array, here's what I got:

Echo of $contestants:

"Joan-Juan Frank" "Kimi-Kinder Karten" "Peter-Parker Plays Piano"

Echo of the elements of array:

echo "${array[0]}"
"Joan-Juan
echo "${array[1]}"
Frank"
echo "${array[2]}"
"Kimi-Kinder
echo "${array[3]}"
Karten"
...

I then tried declaring the array without any variable:

array=("Joan-Juan Frank" "Kimi-Kinder Karten" "Peter-Parker Plays Piano" )

and this was processed correctly, as I would expect:

Working on: Joan-Juan Frank
Working on: Kimi-Kinder Karten
Working on: Peter-Parker Plays Piano

I tried suggestions from other threads like this one, this one or this one, but I wasn't able to make them work when ingesting content from a variable instead of declaring a static array. It is possible I made a mistake along the way.

Do you have any idea how I could fix this?

1 Answer

You could declare the Bash array using the -a option, and have jq escape its output using the @sh builtin:

unset array
declare -a array="($(jq -r '.contestants[].name | @sh' json))"

This should have the same effect as

unset array
array=("Joan-Juan Frank" "Kimi-Kinder Karten" "Peter-Parker Plays Piano")
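To see why the attempts in the question split the names while the declare form does not, here is a minimal sketch in pure Bash. The contestants string is a hard-coded stand-in for the jq output shown in the question (an assumption, so the example runs without jq or the json file), and the names split and parsed are only illustrative.

```shell
#!/usr/bin/env bash
# Stand-in for jq's default (JSON-quoted) output joined by spaces.
# Assumption: hard-coded here so the sketch runs without jq.
contestants='"Joan-Juan Frank" "Kimi-Kinder Karten" "Peter-Parker Plays Piano"'

# Unquoted expansion: the quote characters are ordinary data at this point,
# so the shell only performs word splitting on whitespace.
split=($contestants)
echo "${#split[@]}"    # 7

# With declare -a and a quoted "(...)" value, Bash parses the string again
# as array syntax, so the embedded quotes group the words.
declare -a parsed="($contestants)"
echo "${#parsed[@]}"   # 3
printf 'Working on: %s\n' "${parsed[@]}"
```

Because the second form re-parses the data as shell syntax, it should only be fed output that has been escaped for the shell, which is what the @sh filter in the answer is for.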
pmf
  • This was the missing piece, thank you! I forgot to mention I originally used `declare -a array=($contestants)` format and my mistake was that I should have placed `($contestants)` into double-quotes `"($contestants)"`. – rollingovermycode Jan 11 '23 at 20:07
  • @rollingovermycode Note that the `@sh` is critical for security; the proposal above without it (just directly interpreting JSON strings as shell syntax) could let someone providing your input run arbitrary commands – Charles Duffy Jan 11 '23 at 20:27
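The risk described in the last comment can be sketched in pure Bash as follows. Both strings are hard-coded stand-ins (assumptions, so the example runs without jq) for what jq would print for a hypothetical hostile name `$(echo injected)`: raw mimics the default JSON-quoted output, and escaped mimics the single-quoted form that `@sh` produces.

```shell
#!/usr/bin/env bash
# Hypothetical hostile name as jq would print it WITHOUT @sh (JSON double quotes).
raw='"$(echo injected)"'
# The same name as jq -r '... | @sh' would print it (wrapped in single quotes).
escaped="'\$(echo injected)'"

# Double quotes do not stop command substitution when the string is re-parsed,
# so the embedded command runs.
declare -a unsafe="($raw)"

# Single quotes keep the text literal, so nothing is executed.
declare -a safe="($escaped)"

printf 'unsafe: %s\n' "${unsafe[0]}"   # unsafe: injected
printf 'safe:   %s\n' "${safe[0]}"     # safe:   $(echo injected)
```

Here the injected command is a harmless echo, but with the unescaped form any command embedded in the input would run the same way.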