2

In a Bash shell script, I want to extract an object from a JSON file. For example, given the JSON files below, I would like to extract the dependencies object so that it returns "dmg": ">= 0.0.0", "build-essential": ">= 0.0.0", "windows": ">= 0.0.0", in whatever format. How can I do that?

// My data 1.json:

{
    "platforms": {
        "amazon": ">= 0.0.0",
        "arch": ">= 0.0.0",
        "centos": ">= 0.0.0",
        "debian": ">= 0.0.0"
    },
    "dependencies": {
        "dmg": ">= 0.0.0",
        "build-essential": ">= 0.0.0",
        "windows": ">= 0.0.0"
    },
    "recommendations": {}
}

// My data 2.json:

{
    "platforms": {
        "amazon": ">= 0.0.0",
        "arch": ">= 0.0.0",
        "centos": ">= 0.0.0",
        "debian": ">= 0.0.0"
    },
    "recommendations": {},
    "dependencies": {
        "dmg": ">= 0.0.0",
        "build-essential": ">= 0.0.0",
        "windows": ">= 0.0.0"
    }
}

// My data 3.json:

{
    "dependencies": {
        "dmg": ">= 0.0.0",
        "build-essential": ">= 0.0.0",
        "windows": ">= 0.0.0"
    },
    "platforms": {
        "amazon": ">= 0.0.0",
        "arch": ">= 0.0.0",
        "centos": ">= 0.0.0",
        "debian": ">= 0.0.0"
    },
    "recommendations": {}
}

// My data 4.json:

{
    "dependencies": {
        "dmg": ">= 0.0.0",
        "build-essential": ">= 0.0.0",
        "windows": ">= 0.0.0"
    }
}

// My data 5.json (compressed):

{"dependencies":{"dmg":">= 0.0.0","build-essential":">= 0.0.0","windows":">= 0.0.0"},"platforms":{"amazon":">= 0.0.0","arch":">= 0.0.0","centos":">= 0.0.0","debian":">= 0.0.0"},"recommendations":{}}
Nam Nguyen

4 Answers

1

Have you looked at jsawk? I would generally use Python to parse JSON data on UNIX systems, since it usually comes bundled with the OS.
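As a rough sketch of the Python route from a shell script: this assumes a `python3` binary is on the PATH (the answer only says "python"), and the sample file is a hypothetical stand-in for the question's data. Because it parses the JSON properly, it does not care whether the file is pretty-printed or where the object sits.

```shell
#!/usr/bin/env bash
# Hypothetical sample file, shaped like the compressed example in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"platforms":{"amazon":">= 0.0.0"},"dependencies":{"dmg":">= 0.0.0","build-essential":">= 0.0.0","windows":">= 0.0.0"},"recommendations":{}}
EOF

# Assumption: python3 is installed. The script is read from stdin ("-"),
# and the JSON file name is passed as the first argument.
deps=$(python3 - "$sample" <<'PY'
import json
import sys

with open(sys.argv[1]) as fh:
    data = json.load(fh)

# Print one "name": "constraint" pair per line; empty if the key is missing.
for name, constraint in data.get("dependencies", {}).items():
    print('"%s": "%s"' % (name, constraint))
PY
)

printf '%s\n' "$deps"
rm -f "$sample"
```

The pairs end up in `$deps`, one per line, so the rest of the script can loop over them with `while read`.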

Anyway, you can try this:

awk "/dependencies/,/}/ { print }" test.json | grep ":" | grep -v dependencies

In general, to get the text between two patterns/strings:

awk "/Pattern1/,/Pattern2/ { print }" inputFile

Then use grep ":" to keep every line in the object that contains a ':', and grep -v to drop the line that holds the object name itself.
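The same three steps can be wrapped in a small function so the object name becomes a parameter. This is only a sketch: `extract_object` is a made-up helper name, and it assumes pretty-printed JSON with one key per line and no nested objects inside the target object.

```shell
#!/usr/bin/env bash
# Hypothetical sample file mirroring "My data 1.json" (platforms shortened).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
    "platforms": {
        "amazon": ">= 0.0.0"
    },
    "dependencies": {
        "dmg": ">= 0.0.0",
        "build-essential": ">= 0.0.0",
        "windows": ">= 0.0.0"
    },
    "recommendations": {}
}
EOF

# extract_object NAME FILE  (hypothetical helper, not a standard tool)
#   1. awk prints the range from the line containing "NAME" to the next "}"
#   2. grep ':' keeps only the key/value lines
#   3. grep -v drops the line that carries the object name itself
extract_object() {
    local name=$1 file=$2
    awk -v start="\"$name\"" 'index($0, start), /}/' "$file" \
        | grep ':' \
        | grep -v "\"$name\""
}

deps=$(extract_object dependencies "$sample")
printf '%s\n' "$deps"
rm -f "$sample"
```

Calling `extract_object platforms "$sample"` would return the platforms entries in the same way.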

UPDATE: for JSON that is not pretty-printed

sed "s/[,{}]/&\n/g" prettified.json | awk "/dependencies/,/}/ { print }" | grep ":" | grep -v dependencies | awk '{$1=$1}1'
Nayeem Zen
0

Here is one way with awk:

awk -v RS= -F'},|{' '{print $5}' file | awk 'NF'

$ awk -v RS= -F'},|{' '{print $5}' f | awk 'NF'
    "dmg": ">= 0.0.0",
    "build-essential": ">= 0.0.0",
    "windows": ">= 0.0.0"
jaypal singh
  • I'm curious, how does it know the block named "dependencies" to extract? I don't see the keyword "dependencies" in your command. – Nam Nguyen Mar 02 '14 at 06:43
  • @NamNguyen I set the input to paragraph mode `RS=` and the field separator to `},` or `{`. Once the file is split I just pick the field you seek by stating the `$5`. That is the field holding the value you need. The last pipe is to remove blank lines. – jaypal singh Mar 02 '14 at 06:45
  • It's very short and I'm having a hard time understanding your solution. I'm still reading your command and learning from a pro :) . Kind of like your solution :) – Nam Nguyen Mar 02 '14 at 06:48
  • @NamNguyen Thanks `:)`. It's pretty simple. Just look at your input and count the fields by splitting them at every `},` or `{`. You'll see your block is the `5th` field. – jaypal singh Mar 02 '14 at 06:49
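To make the field counting in these comments concrete, here is a sketch that reports which field holds the block. It swaps in awk's `split()` on the whole file instead of `RS=` and `-F`, so it does not depend on how a particular awk handles newlines in paragraph mode; `dmg_field` and the sample files are hypothetical.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)

# Hypothetical sample mirroring "My data 1.json": dependencies is the second object.
cat > "$tmp/data1.json" <<'EOF'
{
    "platforms": {
        "amazon": ">= 0.0.0"
    },
    "dependencies": {
        "dmg": ">= 0.0.0",
        "windows": ">= 0.0.0"
    },
    "recommendations": {}
}
EOF

# Hypothetical sample mirroring "My data 3.json": dependencies comes first.
cat > "$tmp/data3.json" <<'EOF'
{
    "dependencies": {
        "dmg": ">= 0.0.0",
        "windows": ">= 0.0.0"
    },
    "platforms": {
        "amazon": ">= 0.0.0"
    },
    "recommendations": {}
}
EOF

# Split the whole file at every "}," or "{" and print the number of the
# first field that mentions "dmg".
dmg_field() {
    awk '{ s = s $0 "\n" }
         END {
             n = split(s, f, /},|[{]/)
             for (i = 1; i <= n; i++)
                 if (f[i] ~ /"dmg"/) { print i; exit }
         }' "$1"
}

field1=$(dmg_field "$tmp/data1.json")
field3=$(dmg_field "$tmp/data3.json")
echo "data1: field $field1, data3: field $field3"
rm -rf "$tmp"
```

The field number changes with the position of the block, so `$5` fits a layout like data 1 but would need adjusting for a file where dependencies comes first.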
0
$ tr -d '\n' < myjson.json | sed -e's/[}{]//g' | sed -e's/.*dependencies\":\(.*\)\s*,.*/\1/g' | sed -e's/^ *//g' | sed -e's/, */, /g'
"dmg": ">= 0.0.0", "build-essential": ">= 0.0.0", "windows": ">= 0.0.0"
Red Cricket
-1
sed -n '/dependencies/, /}/ p' t|grep '>='


How this works:

First get the text of the dependencies block, and then extract the dependency lines from it.

Note that this method is independent of where in the text the dependency block is located. As long as it is present, you'll get the answer.
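The two stages can be run separately to see what each contributes. A sketch, using a hypothetical sample shaped like "My data 2.json", where the block is the last one in the file:

```shell
#!/usr/bin/env bash
# Hypothetical sample file; dependencies is deliberately placed last.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
    "platforms": {
        "amazon": ">= 0.0.0"
    },
    "recommendations": {},
    "dependencies": {
        "dmg": ">= 0.0.0",
        "build-essential": ">= 0.0.0",
        "windows": ">= 0.0.0"
    }
}
EOF

# Stage 1: print only the lines from the one matching "dependencies"
# through the next line that contains "}".
block=$(sed -n '/dependencies/,/}/p' "$sample")

# Stage 2: keep only the lines that carry a version constraint.
deps=$(printf '%s\n' "$block" | grep '>=')

printf 'stage 1:\n%s\nstage 2:\n%s\n' "$block" "$deps"
rm -f "$sample"
```

Stage 1 still includes the `"dependencies": {` line and the closing brace; stage 2 is what strips them.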


aman@apollo:~$ sed -n '/dependencies/, /\}/ p' t|grep '>='
        "dmg": ">= 0.0.0",
        "build-essential": ">= 0.0.0",
        "windows": ">= 0.0.0"

Use sed -n '/dependencies/, /}/ p' t|grep '.*=' if there can be operators such as ~= or = in the dependency block (and not just >=).


Compressed version

For the compressed version, you can first "decompress" the file (insert newlines) and then apply the same transformation.
sed -e 's/:{/:{\n/g'  -e  's/},/\n},\n/g' d5|sed -n '/dependencies/, /}/ p'|grep '>='
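A sketch of what the "decompress" step produces, assuming GNU sed (which treats `\n` in the replacement as a newline) and a shortened, hypothetical one-line input in place of d5:

```shell
#!/usr/bin/env bash
# Hypothetical one-line input shaped like "My data 5.json" (platforms shortened).
compressed='{"dependencies":{"dmg":">= 0.0.0","build-essential":">= 0.0.0","windows":">= 0.0.0"},"platforms":{"amazon":">= 0.0.0"},"recommendations":{}}'

# Step 1: break the line after every ":{" and around every "},"
# (assumes GNU sed for the \n in the replacement text).
expanded=$(printf '%s\n' "$compressed" | sed -e 's/:{/:{\n/g' -e 's/},/\n},\n/g')

# Step 2: the same range-then-grep pipeline as for the pretty-printed files.
deps=$(printf '%s\n' "$expanded" | sed -n '/dependencies/,/}/p' | grep '>=')

printf 'expanded:\n%s\n\nresult:\n%s\n' "$expanded" "$deps"
```

The pairs stay on a single comma-separated line, because only the braces get newlines around them.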

The original solution works for the other four files.

axiom