Extracting from multiple and sporadic nested dicts and lists

Question

I have data in a huge, messy JSON file. After some formatting I've got it down to one dict per line that I can read in.

My goal is to output one row of a dataframe for each starting dict (some columns are repeated from dict to dict, and some dicts add new columns).

Example of dict format:

dic = {'name': 'Simon',
       'salary': 25000,
       'children': ['Sally', 'Stuart', 'Paul'],
       'assets': [{'houses': ['50 cool drive', '60 swag lane']},
                  {'vehicles': {'cars': ['bmw', 'kia'],
                                'boats': ...}}]}

Example output:

    name     salary  children               houses          vehicles      cars      boats
0   Simon    25000   Sally, Stuart, Paul    50 cool dr...   cars, boats   bmw, kia  ...

How do I account for the structure changing from dict to list to dict again, and so on?

I've tried something like:

def run(thing):
    # Rough sketch of my attempt, not working code
    if not isinstance(thing, (dict, list)):
        df[thing] = df.get(thing)
    else:
        run(thing)

Also, how do I account for a list within a list, which has no 'column name' that I can append to the df?

I can get everything I want by continually looping and handling everything case by case, but is there a more Pythonic way to do this?

Thanks

https://stackoverflow.com/questions/1305532/convert-nested-python-dict-to-object?rq=1 Does this answer your question for getting a regular dictionary and then converting to dataframe? — Josh Zwiebel, Apr 06 '20 at 14:20
Most of the keys are the same for each dict, but sometimes there are new ones (more or fewer), so it's hard to say. — liamod, Apr 06 '20 at 14:26
You can try this https://stackoverflow.com/questions/60984799/normalize-a-complex-nested-json-file/60985664#60985664 to flatten the json and/or as suggested by @sammywemmy use `jmespath` to help with nested lists. — Raphaele Adjerad, Apr 07 '20 at 06:41
I would accept this as the answer, elegant solution to my problem, thanks! — liamod, Apr 08 '20 at 11:24
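
For reference, the flattening idea from the comments could be sketched roughly as below. This is not the code from the linked answer, just a minimal recursive version under a few assumptions: the names `flatten`, `_leaves`, `record` and `row` are made up for illustration, each column is named after the nearest dict key (so the same key at two different depths would overwrite itself), a list nested inside a list inherits the key of its parent, and the `'boats'` value is a placeholder because the question elides it.

```python
def _leaves(items):
    """Yield the non-list items of arbitrarily nested lists, in order."""
    for item in items:
        if isinstance(item, list):
            yield from _leaves(item)
        else:
            yield item


def flatten(obj, key=None, row=None):
    """Collapse nested dicts/lists into one flat {column: value} dict."""
    row = {} if row is None else row
    if isinstance(obj, dict):
        if key is not None:
            # A nested dict also gets a column listing its own keys,
            # e.g. 'vehicles' -> 'cars, boats'.
            row[key] = ", ".join(str(k) for k in obj)
        for k, v in obj.items():
            flatten(v, k, row)
    elif isinstance(obj, list):
        leaves = list(_leaves(obj))
        # Scalars in the list (or in lists nested inside it) are joined
        # under the parent key, since they have no column name of their own.
        scalars = [str(x) for x in leaves if not isinstance(x, dict)]
        if scalars and key is not None:
            row[key] = ", ".join(scalars)
        # Dicts inside the list contribute their own keys as columns.
        for x in leaves:
            if isinstance(x, dict):
                flatten(x, None, row)
    else:
        row[key] = obj
    return row


# Sample record based on the question; the 'boats' value is hypothetical.
record = {
    'name': 'Simon',
    'salary': 25000,
    'children': ['Sally', 'Stuart', 'Paul'],
    'assets': [
        {'houses': ['50 cool drive', '60 swag lane']},
        {'vehicles': {'cars': ['bmw', 'kia'], 'boats': ['dinghy']}},
    ],
}
row = flatten(record)
print(row)
```

With one flat dict per input line, something like `pandas.DataFrame([flatten(d) for d in dicts])` (where `dicts` is the list of parsed lines) should give one row per dict, with NaN filling any column a particular dict lacks. If column names can collide, building the name from the full key path instead of the last key would be the safer choice.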
