
I have the following working jq transform. Input file (input.jsonl):

{"key": "key1", "value": {"one": 1, "two": 2}}
{"key": "key2", "value": {"three": 3, "four": 4}}

jq transform:

$ jq --compact-output '.key as $key|.value|to_entries|map({key: ($key), member:.key, score:(.value|tostring)})|.[]' input.jsonl

which correctly produces the desired output:

{"key":"key1","member":"one","score":"1"}
{"key":"key1","member":"two","score":"2"}
{"key":"key2","member":"three","score":"3"}
{"key":"key2","member":"four","score":"4"}

The input JSON is quite large: imagine thousands of entries in the "value" field of the example above. I want to perform this exact transformation in jq's streaming mode (--stream) to avoid memory pressure.

I have tried using jq's foreach to no avail. I cannot find a way to store the "key1" value so that it can be referenced as the entries in "value" are processed.

For example, using the same input as the working example:

$ jq -c --stream 'foreach . as $input ({};{in: $input};.)' input.jsonl

{"in":[["key"],"key1"]}
{"in":[["value","one"],1]}
{"in":[["value","two"],2]}
{"in":[["value","two"]]}
{"in":[["value"]]}
{"in":[["key"],"key2"]}
{"in":[["value","three"],3]}
{"in":[["value","four"],4]}
{"in":[["value","four"]]}
{"in":[["value"]]}

I need to reference the value "key1" when processing the second and third events above (the two with "value" paths), and likewise "key2" for the events of the second object.

To reiterate, I want exactly the same output as the non-stream version.

peak
Bill Crook

2 Answers


foreach is unnecessary for this case.

{key: .[1]}
+ ( inputs
    | select(length == 2)
    | {member: .[0][1], score: .[1]}
  )

Note: This answers the initial version of the question, which had a single JSON document as input.
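
For reference, here is a sketch of how the filter above might be run against a single document. The command line is an assumption, since the answer does not show one: it uses `--stream` without `-n`, so that `.` is the first event and `inputs` yields the rest. The `doc` and `result` shell variables are only illustrative names.

```shell
# Single-document input, as in the initial version of the question.
doc='{"key": "key1", "value": {"one": 1, "two": 2}}'

# With --stream and without -n, `.` is the first event, [["key"],"key1"],
# and `inputs` yields the remaining events of the same document.
# Closing events have length 1 and are dropped by select(length == 2).
result=$(printf '%s\n' "$doc" | jq --stream -c '
  {key: .[1]}
  + ( inputs
      | select(length == 2)
      | {member: .[0][1], score: .[1]}
    )')

printf '%s\n' "$result"
```

Run this way, the scores stay numbers (there is no `tostring`), and because `inputs` consumes every remaining event, a second document on the input would still be reported under the first document's key.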

oguz ismail
  • This works on a single json document, however, I actually needed it to handle a multi-line jsonl file. I updated the question to be more specific. Thanks. – Bill Crook May 09 '20 at 00:25
  • @Bill you should have mentioned that at the beginning. Since you've already got a working answer I'm not updating this – oguz ismail May 09 '20 at 04:30

Here's a solution using --stream and foreach that can be used for a stream of JSON objects of the type described. Note that it assumes that "key" appears before "value" in each of the top-level objects.

echo '{"key": "key1", "value": {"one": 1, "two": 2}}' |
    jq -n --stream -c 'foreach inputs as $in (null;
       # .key remembers the current top-level key and .emit holds the next output or null
       if $in|length == 2
       then if $in[0][0] == "key" then .key=$in[1]
            elif $in[0][0] == "value"
            then .emit = {key: .key, member: $in[0][1], score: ($in[1]|tostring)}
            else .emit=null end
       else .emit=null end;
       select(.emit) | .emit)'
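
As a variation on the same idea, the sketch below keeps only the current key as the foreach state and builds each output object in the extract step, where `$in` is still in scope. It is an illustration rather than a replacement for the filter above. It also assumes that "key" precedes "value" in every object, and it converts the score with `tostring` so that the output matches the non-stream version in the question. The `result` shell variable is just an illustrative name.

```shell
# Feed the two-document input from the question on stdin.
result=$(printf '%s\n' \
  '{"key": "key1", "value": {"one": 1, "two": 2}}' \
  '{"key": "key2", "value": {"three": 3, "four": 4}}' |
  jq -n --stream -c '
    foreach inputs as $in (null;
      # Update: remember the most recent top-level "key".
      if ($in|length) == 2 and $in[0] == ["key"] then $in[1] else . end;
      # Extract: emit one object per leaf that sits directly under "value".
      select(($in|length) == 2 and ($in[0]|length) == 2 and $in[0][0] == "value")
      | {key: ., member: $in[0][1], score: ($in[1]|tostring)}
    )')

printf '%s\n' "$result"
```

Because the state is a single string, memory use does not grow with the number of entries under "value". The path-length check means an empty "value" object produces no output.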
peak