
I have a giant JSON string that I am trying to parse through the command line.

Here is an example:

"Product_ID":"productID_1","Price":"$4.99","Cover_Image":"cover.jpg"},{"issue_id":"2","total_article":"36","issue_number":"4","issue_name":"","volume":"57","editors":"","date_of_release":"2013-04-01"

There are multiple issues listed, all with the same fields. I need to pull out the product ID, price, and release date for each issue in the JSON and write them to a text file from the command line.

I was thinking awk was the way to go, but since the JSON comes in as one long line, I have found it to be tricky.

Any suggestions/code examples to pull these fields out?

user1978536

4 Answers


Check out sql4json (http://github.com/bheni/sql4json)

As already noted, that isn't valid JSON. If you had a file named inputfile.json with the contents:

[
   {"Product_ID":"productID_1","Price":"$1.99","Cover_Image":"cover1.jpg","issue_id":"4","total_article":"16","date_of_release":"2013-04-01"},
   {"Product_ID":"productID_2","Price":"$2.99","Cover_Image":"cover2.jpg","issue_id":"5","total_article":"26","date_of_release":"2013-04-02"},
   {"Product_ID":"productID_3","Price":"$3.99","Cover_Image":"cover3.jpg","issue_id":"6","total_article":"36","date_of_release":"2013-04-03"}
]

This should give you what you are looking for:

sql4json --csv 'SELECT Price, Product_ID, date_of_release' <inputfile.json >outputfile.csv

The contents of outputfile.csv would be:

$1.99,productID_1,2013-04-01
$2.99,productID_2,2013-04-02
$3.99,productID_3,2013-04-03
bheni
  • This will still work if the json is all on a single line... Even if stackoverflow doesn't. – bheni Dec 17 '13 at 06:57

As a rule, resist the urge to parse structured formats yourself. A regex might work once or twice on your sample XML or JSON, but it's risky!

Here's a Python one-liner (no script file needed) to pull out a single value:

davanbri@om: echo $J
{"a":"fish"}
davanbri@om: python -c "import json; print(json.loads('$J')['a'])"
fish

Might work for your needs! Or write an actual Python script and pull out all the values you need.
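If the input is a top-level JSON array (like the file in the sql4json answer; inputfile.json is a hypothetical name here), the same one-liner idea extends to all three requested fields. A sketch, assuming python3 is on your PATH:

```shell
# Assumes a top-level JSON array in inputfile.json; json.load reads
# the whole stream, so a single-line file is fine
python3 -c 'import json, sys
for o in json.load(sys.stdin):
    print(o["Product_ID"], o["Price"], o["date_of_release"])' < inputfile.json
```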

david van brink

You most likely have Python installed already; if not, it won't hurt to install it.

printjson.py

import json
import sys

# json.load reads the whole stream, so it works whether the
# input is on one line or many
for obj in json.load(sys.stdin):
    print(obj['Product_ID'])
    print(obj['Price'])

usage

$ echo '[{"Product_ID":"productID_1","Price":"$4.99","Cover_Image":"cover.jpg"}]' | python printjson.py > outfile

$ cat outfile 
productID_1
$4.99
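To also capture the release date the question asks for, the script can be reshaped into a function that returns one comma-separated line per object. A sketch, with field names assumed from the sample data and .get() used so a missing field yields an empty string instead of a KeyError:

```python
import json

def extract_fields(text):
    """Return one comma-separated line per object with the three requested fields."""
    lines = []
    for obj in json.loads(text):
        lines.append(",".join([obj.get("Product_ID", ""),
                               obj.get("Price", ""),
                               obj.get("date_of_release", "")]))
    return "\n".join(lines)
```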
000

There is a fantastic little tool that works well in conjunction with jq. Gigantic files often need to be filtered down first, so my workflow is to run them through https://github.com/ilyash/show-struct/ , a small tool that turns a huge JSON document into a summary of its structure, something like this:

.Records -- (Array of 3 elements)
.Records[]
.Records[].awsRegion -- us-east-1
.Records[].eventName -- DescribeInstances1
.Records[].eventSource -- ec2.amazonaws.com

Then you can use jq; for example:

aws ec2 describe-* | jq .Records[].eventName

There are a ton of options in jq for doing more, but this works for me.
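For the fields in the original question, a jq filter along these lines should work, assuming the input is a top-level array as in the sql4json answer (-r emits raw output; @csv quotes each field and joins with commas):

```shell
# Pull the three requested fields from every object in the array
jq -r '.[] | [.Product_ID, .Price, .date_of_release] | @csv' inputfile.json
```

Note that @csv double-quotes string values, which is usually what you want when a field like Price contains a "$".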