14

I have a large JSON file that is an object of objects, which I would like to split into separate files named after the object keys. Is it possible to achieve this using jq or any other off-the-shelf tools?

The original JSON is in the following format:

{ "item1": {...}, "item2": {...}, ...}

Given this input, I would like to produce the files item1.json, item2.json, etc.

peak
kissaprofeetta
  • Do you want to convert it to different files or to different variables? There are many ways you could convert it into different variables. – Rajeshwar Feb 26 '15 at 14:26
  • I want to convert each object represented by its own key to a separate file. Is there any way to do it with jq or similar tools? – kissaprofeetta Feb 26 '15 at 20:43
  • You can only generate one output at a time. Just make up a script that would get all the item names, then fork out jq calls to get those items out and save to a file. – Jeff Mercado Feb 26 '15 at 21:29

3 Answers

17

This should give you a start:

for f in `cat input.json | jq -r 'keys[]'` ; do
  cat input.json | jq ".$f" > $f.json
done

or, if you insist on a more careful Bash style, as some seem to prefer:

for f in $(jq -r 'keys[]' < input.json) ; do
  jq ".[\"$f\"]" < input.json > "$f.json"
done
Hans Z.
  • To ensure proper behavior in the presence of any key name, it would be necessary to write ".[\"$f\"]" rather than ".$f" – peak Jan 06 '17 at 10:06
  • `cat file | command` is an anti-pattern, and backticks are another one – RomanPerekhrest May 15 '18 at 15:13
  • `jq --arg f "$f" '.[$f]'` would be a further improvement, preventing arbitrary code injection (granted, not a big security issue until jq gets a `system()` call, a way to open files for write, or other facilities that it doesn't currently have) by ensuring that the substituted value can only ever be treated as a literal string; see the sketch after these comments. – Charles Duffy Feb 17 '19 at 04:51
  • `jq -n --argjson f "$f" '$f'` will also guarantee correct parsing; `-n` is important if you're not piping – Thiago Conrado Dec 15 '20 at 21:03
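
Putting those comment suggestions together, a minimal sketch (assuming the same input.json and "$f.json" naming used above):

jq -r 'keys[]' input.json |
  while IFS= read -r f; do
    # pass the key as a jq string variable so it can never be parsed as filter code
    jq --arg f "$f" '.[$f]' input.json > "$f.json"
  done
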
8

Here's a solution that requires only one call to jq:

jq -cr 'keys[] as $k | "\($k)\n\(.[$k])"' input.json |
  while read -r key ; do
    read -r item
    printf "%s\n" "$item" > "/tmp/$key.json"
  done

It might be faster to pipe the output of the jq command to awk, e.g.:

jq -cr 'keys[] as $k | "\($k)\t\(.[$k])"' input.json |
  awk -F\\t '{ print $2 > "/tmp/" $1 ".json" }'

Of course, these approaches will need to be modified if the key names contain characters that cannot be used in filenames.
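
For example, a rough sketch of the awk variant above that replaces "/" (which cannot appear in a Unix filename) with "_"; the replacement character and the close() call are assumptions added here:

jq -cr 'keys[] as $k | "\($k)\t\(.[$k])"' input.json |
  awk -F\\t '{
    gsub(/\//, "_", $1)          # "/" is not allowed in filenames
    file = "/tmp/" $1 ".json"
    print $2 > file
    close(file)                  # keeps very large inputs from exhausting open file descriptors
  }'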

peak
1

Is it possible to achieve this using jq or any other off-the-shelf tools?

It is. xidel can also do this very efficiently.

Let's assume 'input.json':

{
  "item1": {
    "a": 1
  },
  "item2": {
    "b": 2
  },
  "item3": {
    "c": 3
  }
}

Inefficient Bash method:

for f in $(xidel -s input.json -e '$json()'); do
  xidel -s input.json -e '$json("'$f'")' > $f.json
done

For every object key, another instance of xidel is called to parse the object. Especially with a very large JSON file, this is pretty slow.

Efficient file:write() method:

xidel -s input.json -e '
  $json() ! file:write(
    .||".json",
    $json(.),
    {"method":"json"}
  )
'

One xidel call creates 'item{1,2,3}.json'. Their content is a compact/minified object, like {"a": 1} for 'item1.json'.

xidel -s input.json -e '
  for $x in $json() return
  file:write(
    concat($x,".json"),
    $json($x),
    {
      "method":"json",
      "indent":true()
    }
  )
'

One xidel call creates 'item{1,2,3}.json'. Their content is a prettified object (because of {"indent":true()}), like...

{
  "a": 1
}

...for 'item1.json'. A different query (a for-loop this time) that writes the same files.

This method is many times faster!

Reino