Getting a DataFrame out of a list of dicts

Question

import pandas as pd

list_sample = [{'name': 'A', 'fame': 0, 'data': {'date':['2021-01-01', '2021-02-01', '2021-03-01'], 
                        'credit_score':[800, 890, 895],
                        'spend':[1500, 25000, 2400], 
                        'average_spend':5000}},
               {'name': 'B', 'fame': 1, 'data': {'date':['2022-01-01', '2022-02-01', '2022-03-01'],
                                   'credit_score':[2800, 390, 8900],
                                   'spend':[15000, 5000, 400], 
                                   'average_spend':3000}}]

df = pd.DataFrame()
for row in list_sample:
    name = row['name']
    fame = row['fame']
    data = row['data']
    df_temp = pd.DataFrame(data)
    df_temp['name'] = name
    df_temp['fame'] = fame
    df = pd.concat([df, df_temp])

The code above is how I am building my DataFrame. It is a dummy example, but the problem is that it takes a lot of time as the list grows and as the number of entries in each data array grows. Maybe concat is the issue, or maybe it is something else. Is there a better way to do this (better in terms of run time)?

Does this answer your question? [Convert list of dictionaries to a pandas DataFrame](https://stackoverflow.com/questions/20638006/convert-list-of-dictionaries-to-a-pandas-dataframe) — Franciska, Feb 15 '23 at 16:41

score 1 · Answer 1 · answered Feb 15 '23 at 16:45

One way of doing this is to flatten the nested data dictionary inside each record of list_sample. You can do this with pd.json_normalize (the older pandas.io.json.json_normalize import is deprecated).

import pandas as pd

df = pd.DataFrame(list_sample)
# Flatten each 'data' dict into its own columns; list values stay as lists in each cell
df = pd.concat([df.drop(['data'], axis=1), pd.json_normalize(df['data'].tolist())], axis=1)
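
To get one row per date, as in the question's loop, the list-valued columns still have to be expanded. A minimal sketch, assuming pandas 1.3 or later (for multi-column explode) and that the lists inside each data dict all have the same length; the names flat and df are only illustrative, and whether this is faster than the original loop has not been measured here:

```python
import pandas as pd

list_sample = [
    {'name': 'A', 'fame': 0,
     'data': {'date': ['2021-01-01', '2021-02-01', '2021-03-01'],
              'credit_score': [800, 890, 895],
              'spend': [1500, 25000, 2400],
              'average_spend': 5000}},
    {'name': 'B', 'fame': 1,
     'data': {'date': ['2022-01-01', '2022-02-01', '2022-03-01'],
              'credit_score': [2800, 390, 8900],
              'spend': [15000, 5000, 400],
              'average_spend': 3000}},
]

# Flatten the nested dict; columns come out as 'data.date', 'data.spend', ...
flat = pd.json_normalize(list_sample)

# Strip the 'data.' prefix so the column names match the question's output
flat = flat.rename(columns=lambda c: c.split('.')[-1])

# Expand the list-valued columns in lockstep, one row per list element
df = flat.explode(['date', 'credit_score', 'spend'], ignore_index=True)

print(df)
```

The exploded columns come back with object dtype, so a follow-up astype or pd.to_datetime call may be needed depending on what you do next.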

Mulloy

json_normalize takes a lot of compute time. – user13744439 Feb 15 '23 at 16:45

score 0 · Answer 2 · answered Feb 15 '23 at 16:45

It looks like you don't care about normalizing the data column. If that's the case, you can just do df = pd.DataFrame(list_sample) to achieve the same result. I think you'd only need to do the kind of iterating you're doing if you wanted to normalize the data.
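
For reference, a small sketch of what that one-liner produces on the question's sample data. It keeps one row per record and leaves the data column holding the raw dicts, so it only matches the loop's output if un-normalized data is acceptable:

```python
import pandas as pd

list_sample = [
    {'name': 'A', 'fame': 0,
     'data': {'date': ['2021-01-01', '2021-02-01', '2021-03-01'],
              'credit_score': [800, 890, 895],
              'spend': [1500, 25000, 2400],
              'average_spend': 5000}},
    {'name': 'B', 'fame': 1,
     'data': {'date': ['2022-01-01', '2022-02-01', '2022-03-01'],
              'credit_score': [2800, 390, 8900],
              'spend': [15000, 5000, 400],
              'average_spend': 3000}},
]

# One row per record; the nested dict is stored as-is in the 'data' column
df = pd.DataFrame(list_sample)

print(df.shape)
print(type(df.loc[0, 'data']))
```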

vodolazkiy

I do care about normalizing! – user13744439 Feb 15 '23 at 16:53

score 0 · Answer 3 · answered Feb 15 '23 at 16:47

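Merge each record's data dict with its name and fame so that it fits a DataFrame structure, then concatenate everything in a single pd.concat call (the | dict-merge operator requires Python 3.9+):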

df = pd.concat([pd.DataFrame(d['data'] | {'name': d['name'], 'fame': d['fame']}) 
                for d in list_sample])

print(df)

         date  credit_score  spend  average_spend name  fame
0  2021-01-01           800   1500           5000    A     0
1  2021-02-01           890  25000           5000    A     0
2  2021-03-01           895   2400           5000    A     0
0  2022-01-01          2800  15000           3000    B     1
1  2022-02-01           390   5000           3000    B     1
2  2022-03-01          8900    400           3000    B     1
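
If building many small DataFrames is itself the bottleneck, another option is to collect plain Python lists per column and construct the DataFrame once. This is only a sketch, not part of the answer above: the columns dict and the hard-coded key names are assumptions based on the sample data, and any speed-up should be checked with a timing run on real data.

```python
import pandas as pd

list_sample = [
    {'name': 'A', 'fame': 0,
     'data': {'date': ['2021-01-01', '2021-02-01', '2021-03-01'],
              'credit_score': [800, 890, 895],
              'spend': [1500, 25000, 2400],
              'average_spend': 5000}},
    {'name': 'B', 'fame': 1,
     'data': {'date': ['2022-01-01', '2022-02-01', '2022-03-01'],
              'credit_score': [2800, 390, 8900],
              'spend': [15000, 5000, 400],
              'average_spend': 3000}},
]

# Keys assumed from the sample: three list-valued fields, three scalar fields
list_keys = ('date', 'credit_score', 'spend')
columns = {key: [] for key in
           (*list_keys, 'average_spend', 'name', 'fame')}

for rec in list_sample:
    data = rec['data']
    n = len(data['date'])  # number of rows this record contributes
    for key in list_keys:
        columns[key].extend(data[key])
    # Repeat the scalar values once per row of this record
    columns['average_spend'].extend([data['average_spend']] * n)
    columns['name'].extend([rec['name']] * n)
    columns['fame'].extend([rec['fame']] * n)

# Single DataFrame construction, no intermediate frames and no concat
df = pd.DataFrame(columns)

print(df)
```

Unlike the pd.concat version, this yields a fresh 0..n-1 index rather than a repeated 0, 1, 2 per record.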

@onyambu, perhaps they are just envious and miserable human beings. They think they are having fun, but they are just sneaking around. — RomanPerekhrest, Feb 15 '23 at 16:59
