17

I am using Python and need to iterate through JSON objects and retrieve nested values. A snippet of my data follows:

 "bills": [
{
  "url": "http:\/\/maplight.org\/us-congress\/bill\/110-hr-195\/233677",
  "jurisdiction": "us",
  "session": "110",
  "prefix": "H",
  "number": "195",
  "measure": "H.R. 195 (110\u003csup\u003eth\u003c\/sup\u003e)",
  "topic": "Seniors' Health Care Freedom Act of 2007",
  "last_update": "2011-08-29T20:47:44Z",
  "organizations": [
    {
      "organization_id": "22973",
      "name": "National Health Federation",
      "disposition": "support",
      "citation": "The National Health Federation (n.d.). \u003ca href=\"http:\/\/www.thenhf.com\/government_affairs_federal.html\"\u003e\u003ccite\u003e Federal Legislation on Consumer Health\u003c\/cite\u003e\u003c\/a\u003e. Retrieved August 6, 2008, from The National Health Federation.",
      "catcode": "J3000"
    },
    {
      "organization_id": "27059",
      "name": "A Christian Perspective on Health Issues",
      "disposition": "support",
      "citation": "A Christian Perspective on Health Issues (n.d.). \u003ca href=\"http:\/\/www.acpohi.ws\/page1.html\"\u003e\u003ccite\u003ePart E - Conclusion\u003c\/cite\u003e\u003c\/a\u003e. Retrieved August 6, 2008, from .",
      "catcode": "X7000"
    },
    {
      "organization_id": "27351",
      "name": "Natural Health Roundtable",
      "disposition": "support",
      "citation": "Natural Health Roundtable (n.d.). \u003ca href=\"http:\/\/naturalhealthroundtable.com\/reform_agenda\"\u003e\u003ccite\u003eNatural Health Roundtable SUPPORTS the following bills\u003c\/cite\u003e\u003c\/a\u003e. Retrieved August 6, 2008, from Natural Health Roundtable.",
      "catcode": "J3000"
    }
  ]
},

I need to go through each object in "bills" and get "session", "prefix", etc., and I also need to go through each entry in "organizations" and get "name", "disposition", etc. I have the following code:

import csv
import json

path = 'E:/Thesis/thesis_get_data'

with open(path + "/" + 'maplightdata110congress.json', "r") as f:
    data = json.load(f)
a = data['bills']
b = data['bills'][0]["prefix"]
c = data['bills'][0]["number"]

h = data['bills'][0]['organizations'][0]
e = data['bills'][0]['organizations'][0]['name']
f = data['bills'][0]['organizations'][0]['catcode']
g = data['bills'][0]['organizations'][0]['catcode']

for i in a:
    for index in e:
          print ('name')

and it returns the string 'name' a bunch of times.

Suggestions?

Collective Action

4 Answers

24

This might help you.

def func1(data):
    for key,value in data.items():
        print (str(key)+'->'+str(value))
        if type(value) == type(dict()):
            func1(value)
        elif type(value) == type(list()):
            for val in value:
                if type(val) == type(str()):
                    pass
                elif type(val) == type(list()):
                    pass
                else:
                    func1(val)
func1(data)

All you have to do is pass the parsed JSON object (a dictionary) to the function.

There is also a Python library that might help with this. You can find it here -> JsonJ

PEACE BRO!!!

Joish
  • Getting some output but mostly errors: for key,value in data.items(): AttributeError: 'unicode' object has no attribute 'items' – Shaze Oct 22 '20 at 16:14
  • @Shaze, can u please cross check if the data is of type dict or something else.. – Joish Oct 22 '20 at 16:27
22

I found the solution on another forum and wanted to share with everyone here in case this comes up again for someone.

import csv
import json

path = 'E:/Thesis/thesis_get_data'

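# Open the file and parse the whole JSON document into Python objects.
with open(path + "/" + 'maplightdata110congress.json', "r") as f:
    data = json.load(f)

# Outer loop: each bill; inner loop: each organization listed on that bill.
for bill in data['bills']:
    for organization in bill['organizations']:
        print(organization.get('name'))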
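
To also pick up the bill-level fields the question asks about ("session", "prefix", "number") alongside each organization, the same two loops can build one flat row per organization. This is only a minimal sketch, assuming the structure shown in the question; the helper name `flatten_bills` and the abridged inline sample are illustrative and not part of the original data file.

```python
import json

# Tiny inline sample shaped like the data in the question (values abridged).
sample = json.loads("""
{"bills": [
  {"session": "110", "prefix": "H", "number": "195",
   "organizations": [
     {"name": "National Health Federation", "disposition": "support", "catcode": "J3000"},
     {"name": "Natural Health Roundtable", "disposition": "support", "catcode": "J3000"}
   ]}
]}
""")

def flatten_bills(data):
    """Return one dict per (bill, organization) pair. Hypothetical helper."""
    rows = []
    for bill in data.get("bills", []):
        # .get() returns None instead of raising KeyError when a key is absent.
        for org in bill.get("organizations", []):
            rows.append({
                "session": bill.get("session"),
                "prefix": bill.get("prefix"),
                "number": bill.get("number"),
                "name": org.get("name"),
                "disposition": org.get("disposition"),
                "catcode": org.get("catcode"),
            })
    return rows

rows = flatten_bills(sample)
for row in rows:
    print(row["prefix"], row["number"], row["name"], row["disposition"])
```

Since every row is a plain dict with the same keys, the list could be handed to `csv.DictWriter.writerows` if a CSV file is the end goal.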
Collective Action
9

Refining @Joish's answer to use isinstance() instead of comparing types directly:

def func1(data):
    for key,value in data.items():
        print (str(key)+'->'+str(value))
        if isinstance(value, dict):
            func1(value)
        elif isinstance(value, list):
            for val in value:
                if isinstance(val, str):
                    pass
                elif isinstance(val, list):
                    pass
                else:
                    func1(val)
func1(data)

Same as implemented here
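
The AttributeError reported in the comments under @Joish's answer can appear when a list holds something that is neither a dict, a list, nor a str (a number, None, or a Python 2 unicode string, for instance), because func1 then calls .items() on it. One possible variant, offered as a sketch rather than a definitive fix, accepts any JSON value and yields (path, value) pairs instead of printing. The name `walk`, the dotted-path format, and the "tags" key in the sample are all illustrative assumptions.

```python
def walk(node, path=""):
    """Yield (path, value) for every scalar nested inside dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            yield from walk(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, f"{path}[{index}]")
    else:
        # Scalars (str, int, float, bool, None) end the recursion.
        yield path, node

# Hypothetical sample; "tags" is made up to show a list of mixed scalars.
doc = {
    "bills": [
        {
            "prefix": "H",
            "organizations": [{"name": "National Health Federation"}],
            "tags": ["a", 1],
        }
    ]
}

for p, v in walk(doc):
    print(p, "->", v)
```

Because the type check happens on the node itself rather than on its children, the function never calls .items() on a non-dict value.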

Hemant Rakesh
5

The data in this question is nested two levels deep, so two for loops make sense.

Here's an extract from Pluralsight's GraphQL API, with an example that goes three levels deep to get Progress, User, or Course info:

{
  "data": {
    "courseProgress": {
      "nodes": [
        {
          "user": {
            "id": "1",
            "email": "a@a.com",
            "startedOn": "2019-07-26T05:00:50.523Z"
          },
          "course": {
            "id": "22",
            "title": "Building Machine Learning Models in Python with scikit-learn"
          },
          "percentComplete": 34.0248,
          "lastViewedClipOn": "2019-07-26T05:26:54.404Z"
        }
      ]
    }
  }
}

The code to parse this JSON:

for item in items["data"]["courseProgress"]["nodes"]:
    print(item["user"].get('email'))
    print(item["course"].get('title'))
    print(item.get('percentComplete'))
    print(item.get('lastViewedClipOn'))
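
The loop above assumes `items` already holds the parsed response. A self-contained sketch follows, using an abridged copy of the JSON and `json.loads` to build `items`. The chained `.get()` calls with a `{}` default are a defensive addition, not part of the original answer; they keep the lookup from raising if a level is missing.

```python
import json

# Abridged copy of the response shown above.
raw = """
{"data": {"courseProgress": {"nodes": [
  {"user": {"id": "1", "email": "a@a.com"},
   "course": {"id": "22", "title": "Building Machine Learning Models in Python with scikit-learn"},
   "percentComplete": 34.0248}
]}}}
"""
items = json.loads(raw)

# Each .get() falls back to an empty container, so a missing level yields no rows.
nodes = items.get("data", {}).get("courseProgress", {}).get("nodes", [])

summary = [
    (
        node.get("user", {}).get("email"),
        node.get("course", {}).get("title"),
        node.get("percentComplete"),
    )
    for node in nodes
]

for email, title, percent in summary:
    print(email, title, percent)
```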
Jeremy Thompson