I've been having some trouble getting JSON data into a pandas DataFrame in Python. This is what my JSON looks like:

    {
    "results": [
        {
            "events": [
                {
                    "id": 132,
                    "name": "rob",
                    "city": "nyc",
                    "age": 55
                },
                {
                    "id": 324,
                    "name": "sam",
                    "city": "boston",
                    "age": 35,
                    "favColor": "green"
                },
                {
                    "id": 556,
                    "name": "paul",
                    "age": 23,
                    "favColor": "blue"
                },
                {
                    "id": 635,
                    "name": "kyle",
                    "city": "nyc"
                }
            ]
        }
    ],
    "responseinfo": {
        "inspectedCount": 295822,
        "omittedCount": 0,
        "matchCount": 119506,
        "wallClockTime": 34
    }
    }

I'm only trying to create a DataFrame from the data inside the events node, with one column per key. Some of these keys are missing from some events, however, so the events would have to be merged together to make sure all keys/columns exist. I tried cycling through each node, populating a dictionary, and then merging these, but I can't figure it out. Any ideas how I can tackle this? Thanks! Rob

robs
  • You have posted the *data* you are using; you have not posted any of the code that is trying to *use* this data, which is presumably what you want fixed. – Scott Hunter Jul 14 '20 at 15:37
  • Sorry, I never posted it because my approach was wrong, and I was hoping someone would come up with a more elegant solution, which "Recursing" ended up posting. I tried creating an empty table, cycling through each node, and writing keys and values into each column/row of the table. The problem with this is that I was creating the table structure based on the first node, so if a key was missing there, it wouldn't be created in the table. I knew there had to be a better way haha – robs Jul 14 '20 at 16:47

1 Answer

You can use the json module from the standard library to parse the JSON data, then convert the list of dicts to a DataFrame, like this:

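    import json
    import pandas as pd

    # The JSON text from the question (elided here)
    json_data = """ {
        "results": [
            { ..."""

    # Parse the JSON text into nested Python dicts and lists
    data = json.loads(json_data)
    # "results" is a list; its first element holds the "events" list
    events = data["results"][0]["events"]

    # One row per event; keys missing from an event become NaN
    df = pd.DataFrame(events)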
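
For something that runs as-is, here is a minimal sketch of the same approach. It uses a shortened copy of the data from the question; trimming `responseinfo` to two keys is only an assumption made to keep the example short, and it does not affect the result.

```python
import json
import pandas as pd

# Shortened copy of the question's JSON; "responseinfo" is trimmed for brevity
json_data = """
{
    "results": [
        {
            "events": [
                {"id": 132, "name": "rob", "city": "nyc", "age": 55},
                {"id": 324, "name": "sam", "city": "boston", "age": 35, "favColor": "green"},
                {"id": 556, "name": "paul", "age": 23, "favColor": "blue"},
                {"id": 635, "name": "kyle", "city": "nyc"}
            ]
        }
    ],
    "responseinfo": {"inspectedCount": 295822, "omittedCount": 0}
}
"""

# Parse the text, then pick out the list of event dicts
data = json.loads(json_data)
events = data["results"][0]["events"]

# pandas builds the columns from the union of all keys it sees;
# a key that is absent from an event is filled with NaN in that row
df = pd.DataFrame(events)
print(df)
```

The resulting frame has one row per event and a column for every key that appears in any event. Because `age` has a missing value, pandas stores that column as floats rather than integers.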
Recursing
  • Thanks, that did it! Can't believe it was so easy. I spent so much time trying to populate a table by cycling through the JSON nodes, but then I ended up with missing columns. – robs Jul 14 '20 at 16:36