I have a database dump that consists of one huge JSON tree. I want to extract one specific subtree, identified by a known key, that is much smaller than the rest of the tree.

{ "key1": { subtree1... }, "key2": { subtree2... }, ... }

How do I extract subtreeN with streaming jq?

peak
stanm87
  • Possible duplicate of [Process large JSON stream with jq](https://stackoverflow.com/questions/39232060/process-large-json-stream-with-jq) – Zim84 Mar 06 '19 at 15:36

2 Answers

In the following, we'll assume $key holds the key of interest.

The key to efficiency here is to stop reading as soon as all the events that the --stream option produces for $key have been handled. To do so, we can define a helper function as follows. Notice that it uses inputs, and hence the invocation of jq must use the -n command-line option.

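# Emit the [path, leaf] events under $key, then break out early
def filter($key):
  label $out
  | foreach inputs as $in ( null;
      # the state stays null until the first event for $key is seen
      if . == null
      then if $in[0][0] == $key then $in
           else empty
           end
      # the first event past $key ends the scan
      elif $in[0][0] != $key then break $out
      else $in
      end;
      # keep only [path, leaf] events; drop the closing [path] events
      select(length==2) );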

The reconstruction of the desired key-value pair can now be accomplished as follows:

reduce filter($key) as $in ({};
  setpath($in[0]; $in[1]) )
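
To make the tests on $in[0][0] and length==2 concrete, here is a minimal sketch that prints the events --stream produces for a tiny document. The sample document is made up for illustration and is not part of the original answer.

```shell
# Hypothetical sample document, much smaller than a real dump.
doc='{"key1":{"a":1},"key2":{"bb":[11,12]}}'

# Each output line is one event:
#   [path, leaf]  (length 2) for a scalar value
#   [path]        (length 1) when an array or object closes
events=$(printf '%s' "$doc" | jq -c --stream .)
printf '%s\n' "$events"
# [["key1","a"],1]
# [["key1","a"]]
# [["key2","bb",0],11]
# [["key2","bb",1],12]
# [["key2","bb",1]]
# [["key2","bb"]]
# [["key2"]]
```

The first path component of every event is the top-level key, which is what filter compares against $key, and only the length-2 events carry values that setpath needs.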

Example input.json

{
  "key1": {
    "subtree1": {
      "a": {"aa":[1,2,3]}
    }
  },
  "key2": {
    "subtree2": {
      "b1":  {"bb":[11,12,13]},
      "b2":  {"bb":[11,12,13]}
    }
  },
  "key3": {
    "subtree3": {
      "c":  {"cc":[21,22,23]}
    }
  }
}

Illustration (with the two snippets above saved together as extract.jq)

jq -n -c --arg key "key2" --stream -f extract.jq input.json

Output

{"key2":{"subtree2":{"b1":{"bb":[11,12,13]},"b2":{"bb":[11,12,13]}}}}
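
If only the subtree itself is wanted, without the enclosing key, one option is to append .[$key] to the reduce. The following is a sketch under assumptions: the scratch directory, the trimmed-down input, and the trailing .[$key] are additions for illustration, not part of the answer as posted.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# extract.jq: the helper and the reduce from above, plus a final .[$key]
# (added for this sketch) that unwraps the subtree.
cat > "$workdir/extract.jq" <<'EOF'
def filter($key):
  label $out
  | foreach inputs as $in ( null;
      if . == null
      then if $in[0][0] == $key then $in
           else empty
           end
      elif $in[0][0] != $key then break $out
      else $in
      end;
      select(length==2) );

reduce filter($key) as $in ({};
  setpath($in[0]; $in[1]) )
| .[$key]
EOF

# Hypothetical, trimmed-down input.
printf '%s' '{"key1":{"a":[1,2,3]},"key2":{"b1":{"bb":[11,12,13]}},"key3":{"c":[21,22,23]}}' \
  > "$workdir/input.json"

subtree=$(jq -n -c --arg key "key2" --stream -f "$workdir/extract.jq" "$workdir/input.json")
printf '%s\n' "$subtree"
# {"b1":{"bb":[11,12,13]}}

rm -r "$workdir"
```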
peak
  • amazing, thank you so much! This is much better than my previous solution using the `atomize` function provided in the docs. – stanm87 Mar 07 '19 at 08:44

Here’s a simple one-liner using jq’s --stream option:

jq --stream 'first(fromstream(select(.[0][0]=="key2"), [["key2"]]))'
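
Note that without -n, jq runs its program once per streamed event rather than once over the whole stream, so the events need to be gathered with inputs for fromstream to see them together. A variant along these lines might look like the sketch below. The use of -n, inputs and truncate_stream, as well as the sample document, are assumptions for illustration rather than part of the answer as posted.

```shell
# Hypothetical sample document; any JSON object with a "key2" member would do.
doc='{"key1":{"a":1},"key2":{"bb":[11,12]},"key3":{"c":3}}'

# -n plus inputs lets a single program see every event.
# 1|truncate_stream(...) strips the leading "key2" from each path, so
# fromstream emits the subtree as soon as it closes, and first() stops there.
subtree=$(printf '%s' "$doc" |
  jq -n -c --stream 'first(fromstream(1|truncate_stream(inputs | select(.[0][0]=="key2"))))')
printf '%s\n' "$subtree"
# {"bb":[11,12]}
```

This form yields the bare subtree and only applies when the value under the key is an object or array, since truncate_stream drops events whose path has a single component.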
peak