My input is a Python list:

l = [
    {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]},
    {'name': 'bar', 'data': [{'id': 2}, {'type': 'type2'}, {'class': 'B'}]}
    ]

My intermediate objective (possibly an XY problem, but I need it anyway) is to build a dict like this:

new_d  = {
    'name': ['foo', 'bar'],
    'id': [1, 2],
    'type': ['type1', 'type2'],
    'class': ['A', 'B']
    }

My final expected output is this dataframe:

name  id  type class
 foo   1 type1     A
 bar   2 type2     B

I tried the approach below, but I'm getting an error:

new_d = {}

for d in l:
    new_d = {'name': d['name'], **d['data']}

df = pd.DataFrame(new_d)

TypeError: 'list' object is not a mapping

Can you help me fix my code, please?

wjandrea
  • What do you need help with exactly? If you've read the error message, then you'll have realized that `d['data']` is a list of dicts, not a dict itself. If you just need help converting it to a dict, there's an existing question: [How to convert list of dictionaries to dictionary](/q/51149822/4518341) – wjandrea Aug 18 '23 at 14:37
  • Also, did you realize that the `new_d` you're creating there isn't in the same format as what you want? It's a dict of scalars instead of lists. – wjandrea Aug 18 '23 at 14:40
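
A minimal sketch of the point made in the comments, using only the first record from the question: `**` unpacking needs a mapping, so the list under `'data'` has to be merged into a single dict first. The names `record`, `merged`, and `flat` are illustrative and not from the original post.

```python
record = {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]}

# Unpacking the list with ** reproduces the TypeError from the question.
caught = None
try:
    {'name': record['name'], **record['data']}
except TypeError as exc:
    caught = exc

# Merge the single-key dicts into one dict, then unpack that instead.
merged = {}
for part in record['data']:
    merged.update(part)

flat = {'name': record['name'], **merged}
print(flat)
```

Note that `flat` still maps each key to a scalar, so it describes a single row; a list of such dicts (one per record) can be passed to `pd.DataFrame`.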

5 Answers

Let's use `ChainMap` to flatten the nested list of dicts:

from collections import ChainMap

import pandas as pd

df = pd.DataFrame(ChainMap({'name': d['name']}, *d['data']) for d in l)

Resulting dataframe

print(df)

  name class   type  id
0  foo     A  type1   1
1  bar     B  type2   2

Intermediate dictionary

print(df.to_dict('list'))

{'name': ['foo', 'bar'],
 'class': ['A', 'B'],
 'type': ['type1', 'type2'],
 'id': [1, 2]}
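
For readers unfamiliar with `ChainMap`, here is a small standard-library sketch of what the call above relies on: a `ChainMap` presents several dicts as one mapping without copying them, and on duplicate keys the earliest mapping wins. The variable names are illustrative.

```python
from collections import ChainMap

parts = [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]

# One mapping view over the name dict plus every dict in `parts`.
row = ChainMap({'name': 'foo'}, *parts)
print(row['name'], row['id'], row['type'], row['class'])

# dict() materialises the view into an ordinary flat dict.
flat = dict(row)

# On key collisions the first mapping listed takes precedence.
first_wins = ChainMap({'id': 'kept'}, {'id': 'shadowed'})['id']
```

Because a `ChainMap` is not a plain `dict`, the column order pandas produces from it may differ between versions, so select columns explicitly if the order matters.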
Shubham Sharma
  • Thank you so much, your solution is very elegant and is what I'm looking for. –  Aug 18 '23 at 14:39

You can use:

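import pandas as pd

# Flatten each record's list of dicts, then add the name (the | operator needs Python 3.9+)
pd.DataFrame.from_records([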
    {k:v for x in d['data'] for k,v in x.items()} | {'name': d['name']}
    for d in l])

Output:

   id   type class name
0   1  type1     A  foo
1   2  type2     B  bar
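
The `|` dict-merge operator used above requires Python 3.9 or later. As a hedged sketch for older interpreters, the same per-record flattening can be written with `**` unpacking; `flatten` is a hypothetical helper name, and the result is a plain list of dicts that could be handed to `pd.DataFrame.from_records`.

```python
l = [
    {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]},
    {'name': 'bar', 'data': [{'id': 2}, {'type': 'type2'}, {'class': 'B'}]},
]

def flatten(d):
    """Merge the single-key dicts under 'data', then append the name."""
    merged = {k: v for x in d['data'] for k, v in x.items()}
    # Equivalent to `merged | {'name': d['name']}` on Python 3.9+.
    return {**merged, 'name': d['name']}

records = [flatten(d) for d in l]
print(records)
```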
wjandrea
mozway

You can initialise a dataframe directly from a list of dictionaries, which is closer to your current structure; you don't need a dictionary whose values are lists. Something like this should work:

import pandas as pd

original_list = [
    {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]},
    {'name': 'bar', 'data': [{'id': 2}, {'type': 'type2'}, {'class': 'B'}]}
]

new_list = []

for item in original_list:
    new_item = {'name': item['name']}
    for data_item in item['data']:
        new_item.update(data_item)
    new_list.append(new_item)

print(new_list)

print(pd.DataFrame(new_list))

returns:

[{'name': 'foo', 'id': 1, 'type': 'type1', 'class': 'A'}, {'name': 'bar', 'id': 2, 'type': 'type2', 'class': 'B'}]
  name  id   type class
0  foo   1  type1     A
1  bar   2  type2     B
John M.

The problem is that each dictionary in your list has a `data` entry that is itself a list of dictionaries. If you process the data as follows, you get the result you want without needing your intermediate objective:

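import pandas as pd

def parse(x):
    """Flatten one record: merge the dicts under 'data' into the top level."""
    d = {}
    for k, v in x.items():
        if k == 'data':
            # v is a list of single-key dicts; fold each one into d
            for t in v:
                d.update(t)
        else:
            d[k] = v
    return d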

l = [
    {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]},
    {'name': 'bar', 'data': [{'id': 2}, {'type': 'type2'}, {'class': 'B'}]}
    ]

ll = [parse(x) for x in l]
print(pd.DataFrame(ll))
wjandrea
PrinsEdje80

Another option:

l = [
    {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]},
    {'name': 'bar', 'data': [{'id': 2}, {'type': 'type2'}, {'class': 'B'}]}
    ]
dic = {'name':[], 'id':[], 'type':[], 'class':[]}

for i in l:
    for key, val in i.items():
        if key == 'name':
            dic[key].append(val)
        if key == 'data':
            x,y,z = i['data']
            dic['id'].append(x['id'])
            dic['type'].append(y['type'])
            dic['class'].append(z['class'])

print(dic)

Output:

{'name': ['foo', 'bar'], 'id': [1, 2], 'type': ['type1', 'type2'], 'class': ['A', 'B']}
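
The loop above unpacks exactly three items and hard-codes the keys. A sketch of a variant that does not assume the number or names of the inner dicts, using `collections.defaultdict` (my choice here, not part of the answer), could look like this. It does assume every record carries the same keys, since otherwise the lists would end up with different lengths.

```python
from collections import defaultdict

l = [
    {'name': 'foo', 'data': [{'id': 1}, {'type': 'type1'}, {'class': 'A'}]},
    {'name': 'bar', 'data': [{'id': 2}, {'type': 'type2'}, {'class': 'B'}]},
]

# Missing keys start out as empty lists, so no key needs to be declared up front.
dic = defaultdict(list)
for record in l:
    dic['name'].append(record['name'])
    for part in record['data']:
        for key, val in part.items():
            dic[key].append(val)

new_d = dict(dic)
print(new_d)
```

Here `new_d` matches the intermediate dict from the question, and `pd.DataFrame(new_d)` builds the final dataframe from it.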
wjandrea
ragas