
disclaimer: indeed, there are already different answers (like "JQ Join JSON files by key" or "denormalizing JSON with jq"), but none of them helped me yet, or they covered different circumstances that I was unable to derive a solution from ;/


I have 2 files, both lists of objects, where the objects in one file have field references to the object ids in the other one.

given

[
  {
    "id": "5b9f50ccdcdf200283f29052",
    "reference": {
      "id": "5de82d5072f4a72ad5d5dcc1"
    }
  }
]

and

[
  {
    "id": "5de82d5072f4a72ad5d5dcc1",
    "name": "FooBar"
  }
]

my goal would be to get a denormalized object list:

expected

[
  {
    "id": "5b9f50ccdcdf200283f29052",
    "reference": {
      "id": "5de82d5072f4a72ad5d5dcc1",
      "name": "FooBar"
    }
  }
]

While I'm able to do the main parts individually, I haven't managed to bring them together yet:

with example 1

jq -s '(.[1][] | select(.id == "5de82d5072f4a72ad5d5dcc1"))' objects.json referredObjects.json

I get

{
  "id": "5de82d5072f4a72ad5d5dcc1",
  "name": "FooBar"
}

and with example 2

jq -s '.[0][] | .reference = {}' objects.json referredObjects.json

I can manipulate each .reference, getting

{
  "id": "5b9f50ccdcdf200283f29052",
  "reference": {}
}

(even though I lose the list structure)

But I can't do something like the following:

expected "join"

jq -s '.[0][] as $obj | $obj.reference = (.[1][] | select(.id == $obj.reference.id))' objects.json referredObjects.json

Even approaches with foreach or reduce look promising:

jq -s '[foreach .[0][] as $obj ({}; .reference.id = ""; . + $obj )]' objects.json referredObjects.json

=>

[
  {
    "reference": {
      "id": "5de82d5072f4a72ad5d5dcc1"
    },
    "id": "5b9f50ccdcdf200283f29052"
  }
]

where I expected to get the same result as in the second example.

I'm ending up with headaches and am close to writing an inefficient while routine in some other language instead ... I would appreciate any help on this.
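(For reference, one way to combine the two building blocks above while keeping the list structure is a sketch along these lines — the files are reconstructed from the samples in the question, and it assumes each reference id has exactly one match; first/1 is a jq builtin that takes the first value of a stream:)

```shell
# hypothetical reconstruction of the two input files from the question
cat > objects.json <<'EOF'
[{"id": "5b9f50ccdcdf200283f29052", "reference": {"id": "5de82d5072f4a72ad5d5dcc1"}}]
EOF
cat > referredObjects.json <<'EOF'
[{"id": "5de82d5072f4a72ad5d5dcc1", "name": "FooBar"}]
EOF

# bind the second array to $refs, then map over the first array;
# first/1 picks the single matching object out of the stream
jq -s '.[1] as $refs
       | .[0]
       | map(.reference.id as $rid
             | .reference = first($refs[] | select(.id == $rid)))' \
   objects.json referredObjects.json
```

Note this scans the second array once per object in the first, so it is only a sketch for small inputs.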

~Marcel

childno͡.de

2 Answers


Transform the second file into an object where ids and names are paired and use it as a reference while updating the first file.

$ jq '(map({(.id): .}) | add) as $idx
      | input
      | map_values(.reference = $idx[.reference.id])' file2 file1
[
  {
    "id": "5b9f50ccdcdf200283f29052",
    "reference": {
      "id": "5de82d5072f4a72ad5d5dcc1",
      "name": "FooBar"
    }
  }
]
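(To see the intermediate lookup table, the first stage of this filter can be run on its own — a sketch, with the file contents reconstructed from the question:)

```shell
# hypothetical reconstruction of the second file from the question
cat > file2 <<'EOF'
[{"id": "5de82d5072f4a72ad5d5dcc1", "name": "FooBar"}]
EOF

# map({(.id): .}) wraps each object in a single-key object keyed by its id;
# add then merges those into one dictionary
jq 'map({(.id): .}) | add' file2
```

The result is an object keyed by id, so `$idx[.reference.id]` in the main filter becomes a single hash lookup instead of a scan over the array.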
oguz ismail
  • THANK YOU VERY MUCH @Oguz-Ismail, I just simplified it once to just replace the complete .reference object if there are more fields than .name – childno͡.de Feb 26 '20 at 15:16

The following solution takes the same approach as the one by @OguzIsmail but uses the built-in function INDEX/2 to construct the dictionary from the second file.

The important point is that this strategy allows the arrays in both files to be of arbitrary size.

Invocation

jq --argfile file2 file2.json -f program.jq file1.json

program.jq

INDEX($file2[]; .id) as $dict
| map(.reference.id as $id | .reference = $dict[$id])
peak
  • THANK YOU VERY MUCH @peak, I just simplified it once to just replace the complete .reference object if there are more fields than .name. In addition: what are the known restrictions of not using `--argfile`? From the documentation I expect that this would lead to more memory consumption, reading everything in at once prior to processing. Also, the documentation says about `--argfile`: `Do not use. Use --slurpfile instead.` — but it is not 1:1 replaceable, is it?! – childno͡.de Feb 26 '20 at 15:21
  • The semantics of the two command-line options is different. If you are uncomfortable using --argfile, by all means use --slurpfile (with the needed minor adjustment to the jq filter), or you could also use the technique in @OguzIsmail's answer. – peak Feb 26 '20 at 15:29
  • I'm not just uncomfortable with it, but I don't understand the difference and why --argfile is more robust, just asking ;) – childno͡.de Feb 26 '20 at 19:45
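(For completeness, the "minor adjustment" for --slurpfile that peak mentions is that --slurpfile always wraps the file's value in an array, so the array of objects is `$file2[0]` rather than `$file2` itself. A sketch, with the file contents reconstructed from the question:)

```shell
# hypothetical reconstruction of the two files from the question
cat > file1.json <<'EOF'
[{"id": "5b9f50ccdcdf200283f29052", "reference": {"id": "5de82d5072f4a72ad5d5dcc1"}}]
EOF
cat > file2.json <<'EOF'
[{"id": "5de82d5072f4a72ad5d5dcc1", "name": "FooBar"}]
EOF

# --slurpfile binds $file2 to [ <contents of file2.json> ],
# so the inner array is $file2[0]
jq --slurpfile file2 file2.json '
  INDEX($file2[0][]; .id) as $dict
  | map(.reference.id as $id | .reference = $dict[$id])
' file1.json
```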