I have a list of nested dictionary objects in a JSON file, and I am trying to create a DataFrame from it.

Here are the first 2 objects:

data = [
    {
        "model": "class",
        "pk": 48,
        "fields": {
            "unique_key": "9f030ed1d5e56523",
            "name": "john",
            "follower_count": 2395,
            "profile_image": "  "
        }
    },
    {
        "model": "class",
        "pk": 49,
        "fields": {
            "unique_key": "0e8256ad7f27270eb",
            "name": "dais",
            "follower_count": 264,
            "profile_image": "   "
        }
    },
    .....
]

If I try something like:

df = pd.DataFrame(data)

This is what I get.

https://d.top4top.net/p_1132pfebn1.png

While looking for help I found this, but the problem is that a list does not have a keys() method.

karel
  • Please indicate your expected output. – cs95 Feb 06 '19 at 21:45
  • Possible duplicate of [Convert nested json response to dataframe in Python pandas](https://stackoverflow.com/questions/50376983/convert-nested-json-response-to-dataframe-in-python-pandas) – Danilo Cabello Feb 07 '19 at 02:51

2 Answers

It looks like this is data you could flatten using a for loop:

new_data = []

for item in data:
    new_entry = {}
    for k,v in item.items():
        # a dictionary will return True for isinstance(v, dict)
        if not isinstance(v, dict):
            # v is not a dictionary here
            new_entry[k] = v
        else:
            # v is a dictionary, so we flatten it
            for a,b in v.items():
                new_entry[a] = b

    new_data.append(new_entry)

df = pd.DataFrame(new_data)

The inner loop is a more general approach than checking something like if k == 'fields', which would be specific to your data.
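The same loop can be wrapped in a small helper so it is reusable. This is only a sketch: the function name flatten_one_level is made up for illustration, and the sample data holds just the two records shown in the question.

```python
def flatten_one_level(records):
    """Return new dicts in which the keys of any nested dict are merged into the parent."""
    flat_records = []
    for record in records:
        flat = {}
        for key, value in record.items():
            if isinstance(value, dict):
                # Nested dictionary: lift its key-value pairs to the top level
                flat.update(value)
            else:
                flat[key] = value
        flat_records.append(flat)
    return flat_records


# Sample data: the two records from the question
data = [
    {"model": "class", "pk": 48,
     "fields": {"unique_key": "9f030ed1d5e56523", "name": "john",
                "follower_count": 2395, "profile_image": "  "}},
    {"model": "class", "pk": 49,
     "fields": {"unique_key": "0e8256ad7f27270eb", "name": "dais",
                "follower_count": 264, "profile_image": "   "}},
]

flat = flatten_one_level(data)
print(flat[0])
```

The resulting list can then be passed to pd.DataFrame(flat). Note that if a nested key has the same name as a top-level key, whichever one is processed last overwrites the other.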

C.Nivs

Assuming you only have 1 level of nested dictionaries and you know the key name:

for d in data:
    d.update(d.pop('fields'))

You only need to pop the nested dictionary out of each item and add its key-value pairs at the top level. The update method does the latter in place, so the dictionaries in data are modified directly.

Now you can create your pandas dataframe with the columns you were expecting:

In [5]: pd.DataFrame(data)
Out[5]: 
   follower_count  model  name  pk profile_image         unique_key
0            2395  class  john  48                 9f030ed1d5e56523
1             264  class  dais  49                0e8256ad7f27270eb
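Because the pop/update loop modifies data in place, running it a second time on the same list raises a KeyError, since 'fields' is already gone. If the original list needs to stay intact, a non-mutating variant could look like the sketch below. It assumes, as above, that the nested key is always named 'fields'; the sample data is the two records from the question.

```python
# Sample data: the two records from the question
data = [
    {"model": "class", "pk": 48,
     "fields": {"unique_key": "9f030ed1d5e56523", "name": "john",
                "follower_count": 2395, "profile_image": "  "}},
    {"model": "class", "pk": 49,
     "fields": {"unique_key": "0e8256ad7f27270eb", "name": "dais",
                "follower_count": 264, "profile_image": "   "}},
]

# Build new dicts: every top-level key except 'fields', plus the contents of 'fields'
flat = [
    {**{k: v for k, v in d.items() if k != "fields"}, **d["fields"]}
    for d in data
]

print(flat[1])
print("fields" in data[1])  # the original records still contain 'fields'
```

Here flat can be passed to pd.DataFrame in place of data, and the original records remain available for other uses.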