I have the following JSON data:

{
  "Jobs":[
     {"JobId": 111, "ArchiveId": 333},
     {"JobId": 112, "ArchiveId": 333},
     {"JobId": 113, "ArchiveId": 2323},
     {"JobId": 114, "ArchiveId": 444}
  ]
}

And here's the shell script that looks at the JSON object:

count_again=0
jq -r '.Jobs |= unique_by(.ArchiveId)' my-json-archiving.json  \
   | while IFS= read -r job; do
   count_again=$(($count_again + 1))
   echo $job
   echo $count_again
done

The first step works: it filters out duplicates by a given key (.ArchiveId). Once that's done, I want to loop through the result. The main issue with the script above is that it reads the output line by line instead of one job object at a time. I think it has to do with the $job that I return.

I'm very new to shell scripting, so I'm not sure how to get the whole object on each iteration of the loop.

kimbo
gdubs

1 Answer

Edit #2

If you're going to be doing more than just a few simple things with the JobIds and the ArchiveIds, you might consider doing this in Python:

import json

with open('my-json-archiving.json', 'r') as fp:
    jobs = json.load(fp)['Jobs']

seen = set()
unique_by_archive_id = []
for job in jobs:
    # keep only the first job seen for each ArchiveId
    if job['ArchiveId'] not in seen:
        seen.add(job['ArchiveId'])
        unique_by_archive_id.append(job)

for job in unique_by_archive_id:
    job_id = job['JobId']
    archive_id = job['ArchiveId']
    # do stuff here
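
One difference worth noting: the list above keeps the jobs in file order, whereas the jq output shown in the original answer further down comes back sorted by ArchiveId. If that order matters, a minimal sketch that mirrors it might look like the following. Here unique_by is a hypothetical helper written for this example, not a library function, and the sample data is simply the JSON from the question inlined so the snippet runs on its own.

```python
import json

# Sample data copied from the question; in practice this would come from
# json.load() on my-json-archiving.json.
DATA = json.loads("""
{
  "Jobs": [
    {"JobId": 111, "ArchiveId": 333},
    {"JobId": 112, "ArchiveId": 333},
    {"JobId": 113, "ArchiveId": 2323},
    {"JobId": 114, "ArchiveId": 444}
  ]
}
""")


def unique_by(items, key):
    """Hypothetical helper: keep the first item per key, ordered by key."""
    first_seen = {}
    for item in items:
        # setdefault stores the item only if this key has not been seen yet
        first_seen.setdefault(item[key], item)
    return [first_seen[k] for k in sorted(first_seen)]


unique_jobs = unique_by(DATA["Jobs"], "ArchiveId")
for job in unique_jobs:
    print(job["JobId"], job["ArchiveId"])
```

With the sample data this prints the pairs in ArchiveId order (333, 444, 2323), keeping job 111 rather than 112 for the duplicated archive.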

Edit #1

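To get JobId and ArchiveId as variables, print one space-separated pair per job so that each read receives a complete record. By default jq pretty-prints JSON across several lines, which is why the original loop saw it line by line. You could do something like this: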

jq -r '.Jobs |= unique_by(.ArchiveId) | .Jobs[] | "\(.JobId) \(.ArchiveId)"' \
 my-json-archiving.json | while read -r jobId archiveId; do
        # read splits each "JobId ArchiveId" line on whitespace
        echo "Job id: $jobId"
        echo "Archive id: $archiveId"
done

Original answer

I'm not 100% certain what you're asking here. If you want to just get the JobId and the ArchiveId from each job, you could do something like this:

$ jq -r '.Jobs |= unique_by(.ArchiveId) | .Jobs[] | "\(.JobId) \(.ArchiveId)"' \
 my-json-archiving.json
111 333
114 444
113 2323

Text like this works very well with awk. For example:

$ jq -r '.Jobs |= unique_by(.ArchiveId) | .Jobs[] | "\(.JobId) \(.ArchiveId)"' \
 my-json-archiving.json | awk '{print "JobId:", $1, "ArchiveId:", $2}'
JobId: 111 ArchiveId: 333
JobId: 114 ArchiveId: 444
JobId: 113 ArchiveId: 2323
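
If the awk step ever needs to grow beyond formatting, the same space-separated lines can be parsed in Python instead. The following is a minimal sketch only: parse_job_lines is a hypothetical helper name (not part of jq or any library), and the sample lines are copied from the jq output above rather than read from a real pipe.

```python
def parse_job_lines(lines):
    """Hypothetical helper: turn 'JobId ArchiveId' lines into int pairs."""
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            # skip blank or malformed lines rather than failing
            continue
        job_id, archive_id = fields
        pairs.append((int(job_id), int(archive_id)))
    return pairs


# Sample lines copied from the jq output above; passing sys.stdin instead
# would read real jq output from a pipe.
sample = ["111 333", "114 444", "113 2323"]
for job_id, archive_id in parse_job_lines(sample):
    print("JobId:", job_id, "ArchiveId:", archive_id)
```

Any iterable of lines works as input, so the helper can be fed a list while experimenting and a pipe later.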

Similar question: Using jq to extract specific property values and output on a single line.

You could also take a look at jq's man page (man jq); it has lots of examples.

kimbo
  • Hi! Thanks for the response. I basically want to loop through the results and do a few more commands (using jobid and archiveid) without having to write (or output) first – gdubs Mar 03 '20 at 04:16
  • Edited my answer. – kimbo Mar 03 '20 at 04:21
  • thank you! the first one worked. unfortunately im confined at the cli atm, i dont have time to experiment with python as i basically need to execute multiple aws commands per jobid where i need to get the results. https://stackoverflow.com/questions/60501032/looping-through-a-list-of-archive-id-with-cli-aws-describe-job-returns-null-on-o – gdubs Mar 03 '20 at 06:17