Using pandas to flatten a dict

Question

   [{'name': 'Test Item1',
  'column_values': [{'title': 'col2', 'text': 'Oladimeji Olaolorun'},
   {'title': 'col3', 'text': 'Working on it'},
   {'title': 'col4', 'text': '2019-09-17'},
   {'title': 'col5', 'text': '1'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Test Item2',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2019-09-20'},
   {'title': 'col5', 'text': '2'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Test Item3',
  'column_values': [{'title': 'col2', 'text': 'David Binns'},
   {'title': 'col3', 'text': None},
   {'title': 'col4', 'text': '2019-09-25'},
   {'title': 'col5', 'text': '3'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Item 4',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Stuck'},
   {'title': 'col4', 'text': '2019-09-06'},
   {'title': 'col5', 'text': '4'}],
  'group': {'title': 'Group 2'}},
 {'name': 'Item 5',
  'column_values': [{'title': 'col2', 'text': 'David Binns'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2019-09-28'},
   {'title': 'col5', 'text': '5'}],
  'group': {'title': 'Group 2'}},
 {'name': 'item 6',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2020-03-05'},
   {'title': 'col5', 'text': '76'}],
  'group': {'title': 'Group 2'}}]

I'm currently extracting data from Monday.com's API, my call returns the response above with a dict like above I'm trying to find the best method to flatten this dict into a Dataframe.

I'm currently using json_normalize(results['data']['boards'][0]['items']) when I seem to get the result below

The desired output is a table like below

sammywemmy · Answer 1 · 2020-05-06T09:36:09.377

Using the module glom, it becomes easy to extract the required 'text' keys from the nested list. Read in the data to pandas dataframe, split the names column and finally merge back to the parent dataframe.

from glom import glom

spec = {'names':('column_values',['text']),
        'group': 'group.title',
        'Name' : 'name'
        }

the function replaces the None entry to string 'None'

def replace_none(val_list):
    val_list = ['None' if v is None else v for v in val_list]
    return val_list

for i in M:
    i['names'] = replace_none(i['names'])

df = pd.DataFrame(M)

df_split = df['names'].str.join(',').str.split(',',expand=True).add_prefix('Col')

df = df.drop('names',axis=1)

pd.concat([df,df_split],axis=1)

    group   Name         Col0                Col1              Col2   Col3
0   Group 1 Test Item1  Oladimeji Olaolorun Working on it   2019-09-17  1
1   Group 1 Test Item2  Lucie Phillips      Done            2019-09-20  2
2   Group 1 Test Item3  David Binns         None            2019-09-25  3
3   Group 2 Item 4      Lucie Phillips      Stuck           2019-09-06  4
4   Group 2 Item 5      David Binns         Done            2019-09-28  5
5   Group 2 item 6      Lucie Phillips      Done            2020-03-05  76

Update : All of the code above is, unnecessary. the code below is simpler, less verbose, and clearer.

d=[]
for ent in data:
    for entry in ent['column_values']:
        entry.update({'name':ent['name']})
        entry.update({'group':ent['group']['title']})
        d.append(entry)

res = pd.DataFrame(d)

res.set_index(['name','group','title']).unstack()

                                                               text
              title col2                col3            col4    col5
name         group              
Item 4      Group 2 Lucie Phillips      Stuck           2019-09-06  4
Item 5      Group 2 David Binns         Done            2019-09-28  5
Test Item1  Group 1 Oladimeji Olaolorun Working on it   2019-09-17  1
Test Item2  Group 1 Lucie Phillips      Done            2019-09-20  2
Test Item3  Group 1 David Binns         None            2019-09-25  3
item 6      Group 2 Lucie Phillips      Done            2020-03-05  76

score 0 · Answer 2 · answered Jun 21 '22 at 09:34

I think there are many ways to do this thing but I like the most this one. you can update the dictionary according to your need. in the below code, we are deleting unnecessary data and updating keys and values according to our need now we can convert this dict into Dataframe

 for i in range(len(d)):
    data = d[i]
    #changing names of dict 
    for value in d[i]['column_values']:
        data[value['title']] = value['text']
    data['group'] = data['group']['title']
    del(d[i]['column_values']) 
import pandas as pd 
data = pd.DataFrame(d)
data.head()

score 0 · Answer 3 · answered Jul 28 '22 at 16:28

I think this will help. first, I convert the list into dict title as key and text as value and then convert it into Series it makes a data frame for use

def solve(list_d: list) ->pd.Series:
    data = dict()
    for item in list_d:
        # taking values each items in list
        # assign title as a key and text as values 
        data[item['title']] = item['text']
    return pd.Series(data)
df['column_values'].apply(solve).join(d)

drop unnecessary columns and your dataset is ready. if you found any difficulty understanding feel free to ping me

Using pandas to flatten a dict

3 Answers3