
I have two different JSON API responses that contain the information required for further processing. I need to read these responses, pull out the required fields, and create a single new file in the format shown below.

first output:

DETAILS=$(echo $RESPONSE | jq  ".results[] | select( .name | index(\"$CLUSTER\"))" | jq -r '[.name, .regionName, .diskSizeGB, .instanceSizeName]')

echo $DETAILS

[
  "cluster_name",
  "region",
  100,
  "R50"
]

second output:

LATEST_SNAP_ID=$(echo $SNAP_RESPONSE | jq -c ".results[] | select( .createdAt | contains(\"$time_stamp\"))" | jq -r '[.id, .createdAt]')


echo $LATEST_SNAP_ID
[
  "1234567890987654321",
  "2022-12-20T23:01:56Z"
]

I tried various options with jq, but no luck.

expected output would be:

Note: the response contains the values for all the clusters, and we need to pull only the required values for them.

echo $SNAP_RESPONSE | jq '.results[0, 1]'

{
  "cloudProvider": "AWS",
  "copyRegions": [],
  "createdAt": "2022-12-20T23:01:56Z",
  "expiresAt": "2022-12-27T23:03:50Z",
  "frequencyType": "daily",
  "id": "1234567890987654321",
  "links": [
    {
      "href": "url1",
      "rel": "self"
    },
    {
      "href": "url2",
      "rel": "url3"
    }
  ],
  "mongodVersion": "1.1.1",
  "policyItems": [
    "12345465672342"
  ],
  "replicaSetName": "cluster_name_1",
  "snapshotType": "scheduled",
  "status": "completed",
  "storageSizeBytes": 23141234,
  "type": "replicaSet"
}
{
  "cloudProvider": "AWS",
  "copyRegions": [],
  "createdAt": "2022-12-19T23:03:08Z",
  "expiresAt": "2022-12-26T23:05:02Z",
  "frequencyType": "daily",
  "id": "1234567890987654322",
  "links": [
    {
      "href": "url1",
      "rel": "self"
    },
    {
      "href": "url2",
      "rel": "url3"
    }
  ],
  "mongodVersion": "1.1.1",
  "policyItems": [
    "12345465672342"
  ],
  "replicaSetName": "cluster_name_2",
  "snapshotType": "scheduled",
  "status": "completed",
  "storageSizeBytes": 32547137,
  "type": "replicaSet"
}

Since $RESPONSE is too big to paste here, I have selected only the required keys.

echo $RESPONSE | jq '.results[0, 1]' | jq '[.name, .regionName, .diskSizeGB, .instanceSizeName]'

[
  "cluster_name_1",
  "region",
  10,
  "M20"
]
[
  "Cluster_name_2",
  "region",
  160,
  "R50"
]

Please assist.

L_sama
  • It looks like the task could most easily and efficiently be accomplished with just one invocation of jq, but you have not shown the value of $RESPONSE or $SNAP_RESPONSE. Could you perhaps provide a sufficiently representative illustration of what JSON these two variables hold? – peak Dec 21 '22 at 20:47
  • Note that `echo $RESPONSE` is not the same as `echo "$RESPONSE"`. For an example that shows how they can differ, try setting `RESPONSE='{"message": " * HELLO * WORLD * "}'` -- you'll see the whitespace-surrounded `*`s replaced with lists of filenames. That's not the _only_ bug unquoted uses of echo have, just one of the most dramatic; see also [I just assigned a variable, but `echo $variable` shows something different!](https://stackoverflow.com/questions/29378566), and [Why is printf better than echo?](https://unix.stackexchange.com/questions/65803) – Charles Duffy Dec 21 '22 at 21:02
  • @peak added the sample $SNAP_RESPONSE array to the original comment. – L_sama Dec 21 '22 at 21:02
  • Also, if you need to pass a variable to jq, much better to use `jq --arg variable "$variable" '...$variable...'` than `jq "...$variable..."`; the latter lends itself to injection attacks, or just bugs when your values result in things that aren't legal code when parsed as syntax. (Or `--argjson` if the variable data is JSON) – Charles Duffy Dec 21 '22 at 21:05
  • @L_sidd - What about $RESPONSE ? It seems one needs both, no? Please review the [mcve] guidelines. – peak Dec 21 '22 at 21:08
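
The two suggestions in the comments above (quote the expansion, and pass shell values with --arg instead of splicing them into the filter) can be sketched together as follows. The RESPONSE and CLUSTER values here are made up for illustration and only mimic the fields the question selects; the real response is not shown in the question.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, shaped like the fields the question selects.
RESPONSE='{"results":[
  {"name":"cluster_name_1","regionName":"region","diskSizeGB":10,"instanceSizeName":"M20"},
  {"name":"cluster_name_2","regionName":"region","diskSizeGB":160,"instanceSizeName":"R50"}]}'
CLUSTER='cluster_name_2'

# The quoted expansion keeps the JSON intact (no word splitting or globbing).
# --arg hands $CLUSTER to jq as a string value, not as part of the program.
DETAILS=$(printf '%s\n' "$RESPONSE" |
  jq -c --arg cluster "$CLUSTER" '
    .results[]
    | select(.name | index($cluster))
    | [.name, .regionName, .diskSizeGB, .instanceSizeName]')

printf '%s\n' "$DETAILS"
```

Because the filter is now a single-quoted constant, a cluster name containing quotes or backslashes cannot change what the jq program does.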

1 Answer


Your question only includes information for one cluster, so the following only shows how to handle the data for DETAILS and LATEST_SNAP_ID as provided in the question:

jq -n --argjson r1 "$DETAILS" --argjson r2 "$LATEST_SNAP_ID" '
{
    "cluster_configuration": {
      ($r1[0]): {
        "db_size": $r1[2],
        "size": $r1[3],
        "id": $r2[0],
        "created_at": $r2[1]
      }
   }
}
' 

This produces:

{
  "cluster_configuration": {
    "cluster_name": {
      "db_size": 100,
      "size": "R50",
      "id": "1234567890987654321",
      "created_at": "2022-12-20T23:01:56Z"
    }
  }
}

Addendum - Multiple clusters

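Assuming r1 is a command that emits the stream of arrays (shaped like DETAILS), r2 is a command that emits the stream of snapshot objects, and both streams list the clusters in the same order, you could combine them along these lines:

r2 | jq -n --slurpfile r1 <(r1) '
  def cluster($r1; $r2):
    { ($r1[0]): {
        "db_size": $r1[2],
        "size": $r1[3],
        "id": $r2.id,
        "created_at": $r2.createdAt
      }
    } ;
  # Assumption: the i-th array in $r1 describes the same cluster
  # as the i-th snapshot object read from stdin.
  [inputs] as $r2
  | { "cluster_configuration":
        ([range(0; $r1 | length) | cluster($r1[.]; $r2[.])] | add) }
'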

However, as stated in a comment, it would be better if you combined all the calls to jq into one single call to jq.
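
A minimal sketch of what such a single call might look like is given below. It rests on assumptions the question does not confirm: that each snapshot belongs to the cluster whose name equals the snapshot's replicaSetName, and that the newest snapshot per cluster is the one wanted. The sample values are trimmed-down, hypothetical stand-ins for the real responses.

```shell
#!/usr/bin/env bash
# Hypothetical, trimmed-down samples; the real responses carry many more keys.
RESPONSE='{"results":[
  {"name":"cluster_name_1","regionName":"region","diskSizeGB":10,"instanceSizeName":"M20"},
  {"name":"cluster_name_2","regionName":"region","diskSizeGB":160,"instanceSizeName":"R50"}]}'
SNAP_RESPONSE='{"results":[
  {"id":"1234567890987654321","createdAt":"2022-12-20T23:01:56Z","replicaSetName":"cluster_name_1"},
  {"id":"1234567890987654322","createdAt":"2022-12-19T23:03:08Z","replicaSetName":"cluster_name_2"}]}'

# Assumption: a snapshot belongs to the cluster whose .name equals its .replicaSetName.
RESULT=$(jq -n --argjson clusters "$RESPONSE" --argjson snaps "$SNAP_RESPONSE" '
  { cluster_configuration:
      ( [ $clusters.results[] as $c
          # Newest snapshot for this cluster (ISO 8601 timestamps sort as strings).
          | ([ $snaps.results[] | select(.replicaSetName == $c.name) ]
             | max_by(.createdAt)) as $s
          # Skip clusters that have no matching snapshot.
          | select($s != null)
          | { ($c.name): { db_size: $c.diskSizeGB,
                           size: $c.instanceSizeName,
                           id: $s.id,
                           created_at: $s.createdAt } } ]
        | add ) }')

printf '%s\n' "$RESULT"
```

If the responses are too large to pass on the command line, the same filter should work with the data read from files via --slurpfile; in that case each variable is a one-element array, so $clusters[0].results replaces $clusters.results.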

peak
  • how can I iterate over `$RESPONSE` and `$SNAP_RESPONSE` which contain the information for multiple clusters and create a single JSON file? Adding the response output as well to the original question. – L_sama Dec 21 '22 at 21:47