How to make a list of dictionaries from a pandas DataFrame?

Question

I want to build a list of dictionaries with a specific structure, similar to the following:

[{'label': 'Abdelnaby, Alaa', 'value': '76001'},
{'label': 'Abdul-Aziz, Zaid', 'value': '76002'},
{'label': 'Abdul-Jabbar, Kareem', 'value': '76003'}]

The data I am pulling from is currently in a pandas DataFrame. An example is below:

PlayerID  Name                  Current Player  First Season  Last Season
76001     Abdelnaby, Alaa       0               1990          1994
76002     Abdul-Aziz, Zaid      0               1968          1977
76003     Abdul-Jabbar, Kareem  0               1969          1988
51        Abdul-Rauf, Mahmoud   0               1990          2000
1505      Abdul-Wahad, Tariq    0               1997          2003

Please let me know if this is sufficient. Thanks so much for the help!

cs95 · Accepted Answer · 2019-04-14T03:37:43.163

5

Select your columns, rename them, and call to_dict with orient='records' to get a list of dicts:

(df.reindex(['Name', 'PlayerID'], axis=1)
   .set_axis(['label', 'value'], axis=1)
   .to_dict('records'))

# [{'label': 'Abdelnaby, Alaa', 'value': 76001},
#  {'label': 'Abdul-Aziz, Zaid', 'value': 76002},
#  {'label': 'Abdul-Jabbar, Kareem', 'value': 76003},
#  {'label': 'Abdul-Rauf, Mahmoud', 'value': 51},
#  {'label': 'Abdul-Wahad, Tariq', 'value': 1505}]
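
Note that the desired output in the question shows the IDs as strings ('76001'), while the result above keeps them as integers. If strings are needed, one option is to cast the column before converting. A minimal sketch, assuming a small sample frame built here purely for illustration and using rename rather than set_axis:

```python
import pandas as pd

# Sample data for illustration, mirroring the first rows of the question's frame.
df = pd.DataFrame({
    'PlayerID': [76001, 76002, 76003],
    'Name': ['Abdelnaby, Alaa', 'Abdul-Aziz, Zaid', 'Abdul-Jabbar, Kareem'],
})

records = (
    df[['Name', 'PlayerID']]
    .rename(columns={'Name': 'label', 'PlayerID': 'value'})
    .astype({'value': str})   # cast the IDs to strings, as in the desired output
    .to_dict('records')
)

print(records[0])
```

Casting after the rename keeps the whole conversion in one method chain.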

You can output JSON instead by replacing the to_dict call with .to_json(orient='records').
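
A short sketch of the JSON variant, assuming a small sample frame built here for illustration. When no path is given, to_json returns the JSON text as a string, which json.loads can parse back into Python objects:

```python
import json
import pandas as pd

# Sample data for illustration only.
df = pd.DataFrame({
    'PlayerID': [76001, 76002],
    'Name': ['Abdelnaby, Alaa', 'Abdul-Aziz, Zaid'],
})

json_text = (
    df[['Name', 'PlayerID']]
    .rename(columns={'Name': 'label', 'PlayerID': 'value'})
    .to_json(orient='records')   # returns a str because no path is passed
)

# Parse the string back to check its structure.
parsed = json.loads(json_text)
print(parsed)
```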

If performance matters, a list comprehension that builds each dict directly is faster:

[dict(zip(('label', 'value'), r)) for r in df[['Name', 'PlayerID']].values]

# [{'label': 'Abdelnaby, Alaa', 'value': 76001},
#  {'label': 'Abdul-Aziz, Zaid', 'value': 76002},
#  {'label': 'Abdul-Jabbar, Kareem', 'value': 76003},
#  {'label': 'Abdul-Rauf, Mahmoud', 'value': 51},
#  {'label': 'Abdul-Wahad, Tariq', 'value': 1505}]
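
If the same pattern is needed for other columns, the comprehension can be wrapped in a small helper. This is only a sketch: records_from_columns is a hypothetical name, not a pandas function, and it uses itertuples instead of .values to walk the rows.

```python
import pandas as pd

def records_from_columns(df, mapping):
    """Build a list of dicts from selected columns.

    `mapping` maps source column names to the keys used in the output dicts.
    """
    columns = list(mapping)
    keys = [mapping[c] for c in columns]
    # name=None makes itertuples yield plain tuples, one per row.
    rows = df[columns].itertuples(index=False, name=None)
    return [dict(zip(keys, row)) for row in rows]

# Sample data for illustration only.
df = pd.DataFrame({
    'PlayerID': [76001, 51],
    'Name': ['Abdelnaby, Alaa', 'Abdul-Rauf, Mahmoud'],
})

result = records_from_columns(df, {'Name': 'label', 'PlayerID': 'value'})
print(result)
```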

edited Apr 14 '19 at 03:37

answered Apr 14 '19 at 01:34


Conversely, if you want to convert this list of dicts back into a DataFrame, you can see how [here](https://stackoverflow.com/a/53831756/4909087). – cs95 Apr 14 '19 at 01:36
Just curious, why use .reindex(axis=1) over just using [[ ]] to select the columns? – Ben Pap Apr 14 '19 at 02:29
@BenPap method chaining all the way! – cs95 Apr 14 '19 at 02:32
@coldspeed I guess a comprehension that builds the dicts directly might be a little bit faster in this case. – BhishanPoudel Apr 14 '19 at 02:33
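
For the reverse direction mentioned in the first comment, one common way is to pass the list of dicts to the DataFrame constructor. A minimal sketch with sample records typed in for illustration (not necessarily the approach described in the linked answer):

```python
import pandas as pd

# Sample records for illustration only.
records = [
    {'label': 'Abdelnaby, Alaa', 'value': 76001},
    {'label': 'Abdul-Aziz, Zaid', 'value': 76002},
]

# Each dict becomes a row; the dict keys become the column names.
df = pd.DataFrame(records)

# Optionally restore the original column names.
restored = df.rename(columns={'label': 'Name', 'value': 'PlayerID'})
print(restored)
```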

BhishanPoudel · Answer 2 · 2019-04-14T02:53:24.483

If speed is the issue, we can use a list comprehension that builds each dict directly:

myjson = [{'label': name, 'value': pid} for pid, name in zip(df['PlayerID'], df['Name'])]

Gives:

[{'label': 'Abdelnaby, Alaa', 'value': 76001},
 {'label': 'Abdul-Aziz, Zaid', 'value': 76002},
 {'label': 'Abdul-Jabbar, Kareem', 'value': 76003},
 {'label': 'Abdul-Rauf, Mahmoud', 'value': 51},
 {'label': 'Abdul-Wahad, Tariq', 'value': 1505}]

Further, if you want to write the data as JSON:

import json
with open('myjson.json', 'w') as fo:
    json.dump(myjson, fo, indent=4)
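
One caveat when writing JSON: the standard json module cannot encode NumPy integer types. Iterating over a Series yields plain Python ints, but iterating over its .values array yields numpy.int64 objects, which json rejects. A sketch of the problem and one workaround (.tolist()), using a sample frame built for illustration:

```python
import json
import pandas as pd

# Sample data for illustration only.
df = pd.DataFrame({
    'PlayerID': [76001, 76002],
    'Name': ['Abdelnaby, Alaa', 'Abdul-Aziz, Zaid'],
})

# Built from the raw NumPy array: the IDs are NumPy integers.
from_arrays = [{'label': name, 'value': pid}
               for pid, name in zip(df['PlayerID'].values, df['Name'].values)]

try:
    json.dumps(from_arrays)
    serialized_ok = True
except TypeError:
    serialized_ok = False   # json does not know how to encode NumPy integers

# .tolist() converts the column to plain Python ints first.
from_lists = [{'label': name, 'value': pid}
              for pid, name in zip(df['PlayerID'].tolist(), df['Name'].tolist())]
json_text = json.dumps(from_lists, indent=4)
print(json_text)
```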

Speed comparison

%%timeit
myjson = [{'label': name, 'value': pid} for pid,name in zip(df['PlayerID'].values, df['Name'].values)]

5.9 µs ± 125 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)



%%timeit
myjson = (df.reindex(['Name', 'PlayerID'], axis=1)
   .set_axis(['label', 'value'], axis=1)
   .to_dict('records')
)

756 µs ± 24.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
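
The timings above were measured on the five-row sample, where fixed per-call overhead dominates, so the gap may look different on larger frames. A sketch for repeating the comparison outside IPython with the timeit module; the frame size, the synthetic names, and the two function names are arbitrary choices for illustration:

```python
import timeit
import pandas as pd

# Synthetic frame; the size and values are arbitrary.
n = 10_000
df = pd.DataFrame({
    'PlayerID': range(n),
    'Name': [f'Player {i}' for i in range(n)],
})

def with_comprehension():
    return [{'label': name, 'value': pid}
            for pid, name in zip(df['PlayerID'].values, df['Name'].values)]

def with_to_dict():
    return (df[['Name', 'PlayerID']]
            .rename(columns={'Name': 'label', 'PlayerID': 'value'})
            .to_dict('records'))

# Both approaches should produce the same records.
assert with_comprehension() == with_to_dict()

for func in (with_comprehension, with_to_dict):
    seconds = timeit.timeit(func, number=20) / 20
    print(f'{func.__name__}: {seconds * 1e3:.2f} ms per call')
```

Results depend on the pandas version, the dtypes, and the machine, so it is worth measuring on data shaped like your own.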

score 0 · Answer 3 · answered Apr 14 '19 at 07:48

PlayerID  Name                  Current Player  First Season  Last Season
76001     Abdelnaby, Alaa       0               1990          1994
76002     Abdul-Aziz, Zaid      0               1968          1977
76003     Abdul-Jabbar, Kareem  0               1969          1988
51        Abdul-Rauf, Mahmoud   0               1990          2000
1505      Abdul-Wahad, Tariq    0               1997          2003

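# You already have a DataFrame; here it is called `dataframe`.
# Use iloc to select columns by position.
values = dataframe.iloc[:, 0]  # all rows of column 0 (PlayerID)
labels = dataframe.iloc[:, 1]  # all rows of column 1 (Name)
dic = {}

# Map each PlayerID to the Name in the same row.
for val, label in zip(values, labels):
    dic[val] = label

# Hope the logic is clear.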
