
I have a bash script that needs to parse JSON-formatted data. I learned from an earlier question on this forum that bash is not good at parsing JSON, so I intend to use a Python function to do the parsing.

The following is the code snippet:

#!/bin/bash

set -e

declare -a data_id
declare -a data_tid

function parse_data () {
    python <<EOF
import json
parsed = json.loads('''${@}''')

data_id = []
data_tid = []
for child in parsed['params']['children'][0]:
    print 'id: %s' % (child['data']['id'])
    data_id.append(child['data']['id'])
    print 'tid: %s' % (child['data']['tid'])
    data_tid.append(child['data']['tid'])
EOF
}

json=$(get_json_output)
parse_data $json

The data stored in 'json' would look like this:

{
    "cid": "1",
    "code": null,
    "error": null,
    "params": {
        "children": [
            {
                "value": {
                    "id": "0123456789",
                    "tid": "a.b.c"
                },
                "data": {
                    "id": "0987654321",
                    "tid": "p.q.r"
                },
                "tid": "m.n.o.p",
            }
        ],
        "tid": "m.n.o.p"
    },
    "reason": null,
    "result": true
}

I'd like the script to be able to extract the 'id' and 'tid' fields from under 'data', into separate arrays. But the script execution fails as follows:

root@ubuntu# ./abc.sh 
Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
TypeError: string indices must be integers

Any idea on what's wrong?

Maddy

4 Answers


Leave off the [0]:

for child in parsed['params']['children']:

otherwise you are looping over the keys of the first entry in the children list; each key is a string, so child['data'] raises the TypeError.
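To see what the loop variable actually holds in each case, here is a minimal sketch. It is written for Python 3 rather than the Python 2 used in the question, and the `raw` string is a trimmed-down copy of the question's data, both of which are assumptions made for illustration:

```python
import json

# Trimmed-down copy of the question's data (one child only).
raw = ('{"params": {"children": [{"value": {"id": "0123456789"}, '
       '"data": {"id": "0987654321", "tid": "p.q.r"}}]}}')
parsed = json.loads(raw)

# Iterating over a dict yields its keys, which are plain strings.
keys = list(parsed['params']['children'][0])
print(keys)

# Indexing a string with another string raises a TypeError,
# which is the failure reported in the question.
try:
    keys[0]['data']
except TypeError as exc:
    print('TypeError:', exc)

# Iterating over the list itself yields the child dictionaries.
ids = [child['data']['id'] for child in parsed['params']['children']]
print(ids)
```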

Alternatively, if there is only ever one entry in that list, don't loop, but directly assign:

child = parsed['params']['children'][0]
print 'id: %s' % (child['data']['id'])
data_id.append(child['data']['id'])
print 'tid: %s' % (child['data']['tid'])
data_tid.append(child['data']['tid'])
Martijn Pieters
  • I observed that the values are not appended to the arrays data_id and data_tid, though the prints work fine. The array size is shown to be zero. – Maddy Dec 05 '13 at 13:43
  • @Maddy: Are you expecting Python lists to know about your bash arrays? That won't work at all. Subprocesses don't have access to bash variables; you'll have to make the Python process write to stdout or stderr, capture that in your bash script and go from there. – Martijn Pieters Dec 05 '13 at 13:44
  • Is there no way out? Can they not co-exist in the script and share data? – Maddy Dec 05 '13 at 13:47
  • No, bash is not known for its extensibility. Python runs as a subprocess, so it has no access to the memory of the parent process, not without explicit message passing anyway. – Martijn Pieters Dec 05 '13 at 13:55
  • Why not move *everything* to a Python script instead? That'd remove the need to solve that problem at least. – Martijn Pieters Dec 05 '13 at 13:55
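The point made in these comments about subprocesses can be illustrated without bash at all. The following is only a sketch, in Python 3, with a parent Python process standing in for the bash script (that stand-in and the `child_code` name are assumptions for illustration): the child's list never reaches the parent, and only what the child prints to stdout can be captured.

```python
import subprocess
import sys

# The parent's list; the child process gets no access to it.
data_id = []

# Code run in a separate Python process, much like the heredoc in the question.
child_code = (
    "data_id = []\n"
    "data_id.append('0987654321')\n"
    "print(data_id[0])\n"
)

result = subprocess.run(
    [sys.executable, '-c', child_code],
    capture_output=True,
    text=True,
    check=True,
)

# The child's append did not touch the parent's list.
print(len(data_id))

# Only the captured stdout carries data back to the parent.
data_id = result.stdout.splitlines()
print(data_id)
```

The same holds for a bash parent: it has to capture the Python process's output and fill its own arrays from that.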

You're iterating over only the first item of the children list, so "child" is actually a key of that dictionary.

You should remove the [0] from the for loop.

6160

The problem is this line:

for child in parsed['params']['children'][0]:
    ...

parsed['params']['children'][0] is not a list; it is a dictionary, so the loop iterates over its keys.

Change it to either

for child in parsed['params']['children']:
    ...

or

# this one only for testing
for child in [parsed['params']['children'][0]]:
    ...

or

# also for testing
child = parsed['params']['children'][0]
jpwagner

You would find this much easier to debug if you first wrote the Python script, then tried to embed it in a bash script. Here is the debugged version:

import json, sys

parsed = json.load(sys.stdin)

port_id = []
port_tid = []
for child in parsed['params']['children']:
    print 'id: %s' % (child['data']['id'])
    port_id.append(child['data']['id'])
    print 'tid: %s' % (child['data']['tid'])
    port_tid.append(child['data']['tid'])

Second, you have a bug in your JSON data: the trailing comma after the last "tid" entry in the child object is not valid JSON. I think you meant this:

{
    "cid": "1",
    "code": null,
    "error": null,
    "params": {
        "children": [
            {
                "value": {
                    "id": "0123456789",
                    "tid": "a.b.c"
                },
                "data": {
                    "id": "0987654321",
                    "tid": "p.q.r"
                },
                "tid": "m.n.o.p"
            },
            {
               "value": {
                    "id": "0987654321",
                    "tid": "a.b.c"
                },
                "data": {
                    "id": "0123456789",
                    "tid": "p.q.r"
                },
                "tid": "m.n.o.p"
            }
        ],
        "tid": "m.n.o.p"
    },
    "reason": null,
    "result": true
}

Finally, you still need to load the output into your Bash arrays. Here's my solution:

#!/bin/bash

set -e

parse_ids() { python -c '
import json, sys
parsed = json.load(sys.stdin)
print u"\n".join(c["data"]["id"] for c in parsed["params"]["children"])'
}

parse_tids() { python -c '
import json, sys
parsed = json.load(sys.stdin)
print u"\n".join(c["data"]["tid"] for c in parsed["params"]["children"])'
}

#json=$(get_json_output)
json=$(</dev/stdin)

declare -a port_id
mapfile -t port_id < <(echo "$json" | parse_ids)
echo "${port_id[@]}"

declare -a port_tid
mapfile -t port_tid < <(echo "$json" | parse_tids)
echo "${port_tid[@]}"
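As a possible variation, the two Python calls could be folded into one pass over the data. This is a sketch only, written for Python 3: the `format_pairs` helper, the `sample` input, and the tab-separated output format are assumptions rather than part of the script above, and it presumes that neither field contains a tab or a newline.

```python
import json


def format_pairs(json_text):
    """Return one 'id<TAB>tid' line per child's 'data' object."""
    parsed = json.loads(json_text)
    return [
        '%s\t%s' % (child['data']['id'], child['data']['tid'])
        for child in parsed['params']['children']
    ]


# Sample input shaped like the question's data (values are illustrative).
sample = '''
{"params": {"children": [
    {"data": {"id": "0987654321", "tid": "p.q.r"}},
    {"data": {"id": "0123456789", "tid": "x.y.z"}}
]}}
'''

for line in format_pairs(sample):
    print(line)
```

In a real script the text would come from sys.stdin.read() instead of `sample`, and the bash side could split each line on the tab (for example with IFS=$'\t' read -r in a while loop) and append the two fields to port_id and port_tid.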
Michael Kropat