
I started using jq a couple of days ago and have used it for a few parsing tasks. I'm enjoying its versatility, but the task outlined below is more complicated and I'm stuck wrestling with jq over it.

I have two files, A and B, with similar schemas. The files, and the output I'm hoping to get, look like this:

File A:

[
  {
    "type": "hive-site",
    "tag": 1507894175,
    "properties" : {
      "javax.jdo.option.ConnectionPassword" : "hortonworks1"
    }
  },
  {
    "type": "admin-properties",
    "tag": 1507894175,
    "properties": {
      "fieldA": "valueA",
      "fieldB": "valueB"
    }
  }
]

File B:

[
  {
    "type": "hive-site",
    "properties" : {
      "javax.jdo.option.ConnectionPassword" : "hortonworks2"
    }
  },
  {
    "type": "admin-properties",
    "properties": {
      "fieldA": "valueA",
      "fieldB": "valueB",
      "fieldC": "valueC"
    }
  },
  {
    "type": "other-type",
    "properties": {
      "newFieldA": "valueA",
      "newFieldB": "valueB"
    }
  }
]

Result: File C (File A as base, with modifications from File B)

[
  {
    "type": "hive-site",
    "tag": 1507894175,
    "properties" : {
      "javax.jdo.option.ConnectionPassword" : "hortonworks2"
    }
  },
  {
    "type": "admin-properties",
    "tag": 1507894175,
    "properties": {
      "fieldA": "valueA",
      "fieldB": "valueB",
      "fieldC": "valueC"
    }
  },
  {
    "type": "other-type",
    "tag": "NEW",
    "properties": {
      "newFieldA": "valueA",
      "newFieldB": "valueB"
    }
  }
]

I'd like to take all of the pairs under "properties" from File B and push those to File A, updating existing property pairs if they exist, or adding them as their own block (with a "NEW" tag as shown) if they do not.

I've found similar answers (here and here), but none are close enough to be able to modify for my purposes.

Thank you!!

peak
orendain

3 Answers


Here's a succinct and straightforward solution, based on the fact that in jq, if X and Y are two JSON objects, the expression X + Y gives precedence to keys in Y.

First, suppose the following is in combine.jq:

def lookup($t):
  ($A[] | select(.type==$t)) // {tag:"NEW"};

map( lookup(.type) + . )

The following invocation will then yield the desired result:

jq --argfile A A.json  -f combine.jq  B.json

One-liner

If you want a "one-liner":

jq --argfile A A.json 'map(((.type as $t|$A[]|select(.type==$t))//{tag:"NEW"}) + .)' B.json
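
To see what the precedence rule does to nested objects such as "properties", it may help to compare + with the recursive merge operator * on two small objects. This is only an illustrative sketch: the objects and the shell variable names (shallow, deep) are made up for the example and do not come from the question's files.

```shell
# Hypothetical objects, chosen only to illustrate the two operators.
# `+` is a shallow merge: the right-hand "properties" object replaces
# the left-hand one as a whole.
shallow=$(jq -n -c '{tag: 1, properties: {a: 1, b: 2}} + {properties: {b: 3}}')

# `*` merges recursively: keys that exist only on the left survive,
# and keys present on both sides take the right-hand value.
deep=$(jq -n -c '{tag: 1, properties: {a: 1, b: 2}} * {properties: {b: 3}}')

echo "$shallow"
echo "$deep"
```

With +, the resulting "properties" contains only b; with *, it keeps a and updates b. For the sample files in the question both operators give the same result, because every "properties" object in File B is a superset of its counterpart in File A.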
peak
  • The "NEW" tag shown in the sample output could be included with a minor change to `lookup` e.g. `def lookup($t): $A[] | select(.type==$t) // {tag:"NEW"};` – jq170727 Oct 13 '17 at 19:57
  • Hmm, I can't seem to get this to work. Here's what I'm trying: [Try it online](https://tio.run/##rVKxbsIwFNzzFU8WA6g4lKqINhVDQO1YdeuAGCxiiCH4ubEDrYBfb@qYQCNREEIdEuXuEt@9e5l95EMPYG0vAGK@FCcBkEwazMYxj@gY5URMSbPU2dTK7c5t9@Hxvt3tlLRKUfHUCK4JBOVZlta44BPBk6hgHVqyJOPE6Vt73zaPrWOx5FQLw6/znLEl@/RnEfqojEDpD1BKPi4e35jWK0x3aWJMDUoL57pdOl2bmEULIWklz8XBK7mda1gc5xzD30xO6R@UfiWMNwKmoRZ6Gxhe0OT/VXZ3XSN/jH5q8OOxq/zgwA/OJ0ET85Q6dLZ@yVcvJzewF08tYZNHfAIJ4jxT9ZppBJauhcMRbEDzxDZZ94sEvZ7VoNWCtf0zAvL6/E62T563YKq@/9i914Ab8KGR59@7feicUpklCRVSZcaClK0oZsaCHw) – orendain Oct 18 '17 at 01:59
  • @EdgarOrendain - What a mess! I'd suggest you try using jq at a command-line prompt as you have two JSON files. In any event, make sure your JSON really is JSON. Note that if your jq is ancient, you won't be able to use $ in the formal function parameters. – peak Oct 18 '17 at 02:16
  • @EdgarOrendain - I was able to get your example to work on tio (which uses jq 1.5) -- I used the -n option; added the data to the "code" section using def A: and def B:; and made other small modifications accordingly. – peak Oct 18 '17 at 02:33
  • But I'd suggest using jqplay.org if you want to use an online resource. – peak Oct 18 '17 at 02:40

Another jq solution (grouping by .type key):

jq --slurpfile f2 fileB '[$f2[0] + . | group_by(.type)[] 
     | if .[1] then .[1] + .[0] else .[0] end]' fileA

The output:

[
  {
    "type": "admin-properties",
    "tag": 1507894175,
    "properties": {
      "fieldA": "valueA",
      "fieldB": "valueB",
      "fieldC": "valueC"
    }
  },
  {
    "type": "hive-site",
    "tag": 1507894175,
    "properties": {
      "javax.jdo.option.ConnectionPassword": "hortonworks2"
    }
  },
  {
    "type": "other-type",
    "properties": {
      "newFieldA": "valueA",
      "newFieldB": "valueB"
    }
  }
]
RomanPerekhrest
  • Thanks for the help! I found that this solution would replace all fileA properties with the ones from fileB, rather than merging the two :/ – orendain Oct 18 '17 at 01:59

Here is a solution that uses reduce to build a temporary lookup table from File A, then uses map to produce the requested output, including the "NEW" tag on items in File B that are not present in File A.

  (reduce $A[] as $a({};.[$a.type]=$a)) as $t
| map(if $t[.type]==null then {tag:"NEW"}+. else $t[.type]*. end)


This can be made more concise by eliminating the if in the map and using the alternative operator // instead, e.g.:

  (reduce $A[] as $a({};.[$a.type]=$a)) as $t | map( ($t[.type]//{tag:"NEW"})*. )


Sample Run (assumes filter in filter.jq and data in FileA.json and FileB.json):

$ jq -M -f filter.jq --argfile A FileA.json FileB.json
[
  {
    "type": "hive-site",
    "tag": 1507894175,
    "properties": {
      "javax.jdo.option.ConnectionPassword": "hortonworks2"
    }
  },
  {
    "type": "admin-properties",
    "tag": 1507894175,
    "properties": {
      "fieldA": "valueA",
      "fieldB": "valueB",
      "fieldC": "valueC"
    }
  },
  {
    "tag": "NEW",
    "type": "other-type",
    "properties": {
      "newFieldA": "valueA",
      "newFieldB": "valueB"
    }
  }
]
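
To try the concise filter without creating any files, the data can be passed inline. The sketch below assumes jq 1.5 or later (for --argjson) and uses trimmed copies of the question's data; the shell variable names A, B and result are illustrative only.

```shell
# Trimmed copies of the question's data (variable names are illustrative).
A='[{"type":"hive-site","tag":1507894175,"properties":{"javax.jdo.option.ConnectionPassword":"hortonworks1"}}]'
B='[{"type":"hive-site","properties":{"javax.jdo.option.ConnectionPassword":"hortonworks2"}},
    {"type":"other-type","properties":{"newFieldA":"valueA"}}]'

# Build a lookup table from A keyed by .type, then merge each B entry
# over its A counterpart, or over {tag:"NEW"} when A has no such type.
result=$(printf '%s' "$B" | jq -c --argjson A "$A" '
  (reduce $A[] as $a ({}; .[$a.type] = $a)) as $t
  | map( ($t[.type] // {tag: "NEW"}) * . )
')

echo "$result"
```

The hive-site entry keeps its tag from A and takes the password from B, while other-type, which exists only in B, receives the "NEW" tag.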

The following filter also keeps any objects in File A whose type does not appear in File B (as requested in the comments).

  (reduce $A[] as $a({};.[$a.type]=$a)) as $t  # build lookup table
| map( ($t[.type]//{tag:"NEW"})*. )            # apply A to B
| (($t|keys_unsorted)-map(.type)) as $o        # find keys in A not in B
| [$t[$o[]]] + .                               # add back those objects


jq170727
  • Heya, this solution seems very promising, only I'd also like to keep the blocks in file A not updated by file B, rather than throwing them out. I edited the link you posted - any followup would be appreciated! (Edit: Apologies, comment too long, updated link in the next comment) – orendain Oct 18 '17 at 02:02
  • [Try it online](https://tio.run/##rZAxT8MwEIX3/IqTlSGFOqKIqlDUIUWsiI0h6mAl18YhtUNst6A2f53gOFFbxFJVDLbu3bPvfbr8o4k9gJ09AER/lUimQIzQ0iQZpjSRYslXZNj7bGXt0fhmcv9wN5qM@3ZZyRIrzVERmPazbFvJNS45FmnbdWrDCoPE@bW96@Hf6IxvkCqu8bLMnG3YZ5inMpSl5lKET1IITNrylSm1lVVHk8lKS2Hluxr1SZcSs3TNBT3hORv8hNulRu04lxgdmZwzPzjzExhvAUyBH3l7iM/Y5P@t7PYXxL4BCCpMTYIWJu6gWLCrH8PYZ2HLs5j5bDBwhra0a1YGfGnruHdnwhQF6AwF7OzepuTl@Y3U1yFgofD47so2RDpomu8OVTWUtj8pF6XRVlRsS6XRVvwA) – orendain Oct 18 '17 at 02:03
  • This version preserves all of A: [Try it online!](https://tio.run/##rZBBT8IwFMfvfIqXucOmbIjRqBgPYLwabx6WxZS1SGH01bUFCfDVnV03ZcbEEGOTLX39t7/3y5u9lkkHYGM/AE@vJfMG4Bmh0WRTRqMMxYS/eN0mJy827l@cXl5dn/cvL5pjWaBkheZMeTBoWPZY4YJNOMtpdeqqJckN81y@s/9d92frKV@ySHHN/tZzRpbkLZ5RjFFqjiK@QyFYVm0fiVIrLGqbKRYahS3nqt90@qsxoQsuopbPweItb9d1WOFcx@HeySWjr2TUkumkQBT4w84WkgMm@X8jO/smsS1rUlAwajJmhZJajASb3U2c@CSunNJbn4ShCzTAEYwNzynkiHMjQZNxzhx0CwsiAwh8ndTPer2NHeXAe7h/8nbhcQwhtNYRECnzNQxBI4waQmBfb@dsrZ6NUFab0TCqqA7YOOCeMOGCQnUduLAggbrafMISa@JjkqYpnEAMvy@rQymMSTYHPUXFHLYs3@vRqjKKhMnziAtptC0KsorQaFt8AA "jq – Try It Online") – jq170727 Oct 18 '17 at 05:34