I am trying to split a large JSON file (~4 million elements) into separate files (one file per element).
The file looks roughly like this:
{
  "books": [
    {
      "title": "Professional JavaScript - \"The best guide\"",
      "authors": [
        "Nicholas C. Zakas"
      ],
      "edition": 3,
      "year": 2011
    },
    {
      "title": "Professional JavaScript",
      "authors": [
        "Nicholas C.Zakas"
      ],
      "edition": 2,
      "year": 2009
    },
    {
      "title": "Professional Ajax",
      "authors": [
        "Nicholas C. Zakas",
        "Jeremy McPeak",
        "Joe Fawcett"
      ],
      "edition": 2,
      "year": 2008
    }
  ]
}
To split each book into a separate file, I am using the following command:
cat books.json | jq -c -M '.books[]' | while read line; do echo $line > temp/$(date +%s%N).json; done
For the last two items, everything is fine, because the book titles do not contain any quotes. In the first one, however, each \" gets replaced by ", which produces a broken JSON file: the subsequent parser, of course, interprets the " as the boundary of an element.
I've tried to use jq -r, but that did not help.
I'm using the jq version shipped by CentOS 7:
[root@machine]$ jq --version
jq-1.6
Any suggestions?
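
For reference, a likely cause is the shell loop rather than jq: read without -r treats a backslash as an escape character and drops it, and an unquoted $line passed to echo is subject to word splitting and globbing. Below is a minimal sketch of a loop that avoids both. The helper name split_lines and the counter-based file names are my own assumptions for illustration (a counter also rules out two elements ever getting the same timestamp), not something jq provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write each line read from stdin to its own file
# inside the directory given as the first argument.
split_lines() {
  local outdir n line
  outdir=$1
  n=0
  mkdir -p "$outdir"
  # IFS= preserves leading/trailing whitespace; -r keeps backslashes literal.
  while IFS= read -r line; do
    n=$((n + 1))
    # printf '%s\n' writes the line verbatim; the quotes prevent word splitting.
    printf '%s\n' "$line" > "$outdir/$n.json"
  done
}

# Demonstration with a single line that contains escaped quotes.
sample='{"title":"Professional JavaScript - \"The best guide\""}'
demo_dir=$(mktemp -d)
printf '%s\n' "$sample" | split_lines "$demo_dir"
result=$(cat "$demo_dir/1.json")
```

With the file from this question, the call would presumably look like jq -c -M '.books[]' books.json | split_lines temp. Note that jq -r is not expected to change anything here, since it only affects output values that are plain strings, not whole objects.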