0

I'm trying to parse nested json results.

data = {
"results": [
        {
            "components": [
                {
                    "times": {
                            "periods": [
                                {
                                    "fromDayOfWeek": 0,
                                    "fromHour": 12,
                                    "fromMinute": 0,
                                    "toDayOfWeek": 4,
                                    "toHour": 21,
                                    "toMinute": 0,
                                    "id": 156589,
                                    "periodId": 20855
                                }
                            ],
                        }
                    }
                ],
            }
        ],
    }

I can get to and create dataframes for "results" and "components" lists, but cannot get to "periods" due to the "times" dict. So far I have this:

df = pd.json_normalize(data, record_path = ['results','components'])

Need a separate "periods" dataframe with the included column names and values. Would appreciate your help on this. Thank you!

Tentek
  • 1
  • 1
  • Why not `df = pd.json_normalize(data, record_path = ['results','components', 'times', 'periods'])`? – jlesuffleur Mar 11 '21 at 08:20
  • @jlesuffleur you're right, it works. i simplified this example here, but learned that the error wasn't from the json.normalize, but from the loop. thanks for the comment! – Tentek Mar 12 '21 at 09:15

1 Answers1

0
  • I results
  • II components
  • III times
  • IIII periods

The normalize should be correct way:

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html

There is 4 level of nesting. There can be x components in results and y times in components - however that type of nesting is overengineering?

The simplest way of getting data is:

print data['a']['b']['c']['d'] (...)

in your case:

print data['results']['components']['times']['periods']


You can access the specific nested level by this piece of code:

def GetPropertyFromPeriods (property):
    propertyList = []
    for x in data['results']['components']['times']:
        singleProperty = photoURL['periods'][property]
        propertyList.append(singleProperty)
    return propertyList

This give you access to one property inside periods (fromDayOfWeek, fromHour, fromMinute)

After coverting json value, transform it into pandas dataframe:

print pd.DataFrame(data, columns=["columnA", "columnB”])

If stuck:

How to Create a table with data from JSON output in Python

Python - How to convert JSON File to Dataframe

pandas documentation:

  • pandas.DataFrame.from_dict
  • pandas.json_normalize
Piotr Żak
  • 2,046
  • 5
  • 18
  • 30
  • 1
    Hey Piotr, thanks for the answer. You're correct, it worked. My actual issue, unfortunately, isn't related to the nesting as I just learned, but from looping. I simplified the example here, but my actual code has a loop and it was throwing error for missing nests, which I'll have to solve first. Thank you for your answer tho - appreciate it! – Tentek Mar 12 '21 at 09:17
  • the error was due to: the "times" sometimes doesn't have the "periods" list in it, so the loop I guess was not finding "periods" when I was iterating and trying to get to it. – Tentek Mar 17 '21 at 02:08