How to convert a pandas DataFrame to a list of dictionaries where NaN values are omitted?

Question

This question is similar to Pandas DataFrame to List of Dictionaries, except that the DataFrame is not 'full': it contains some NaN values. Suppose I generate a DataFrame from a list of dictionaries like so:

import pandas as pd

data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

so that the resulting df looks like

   bar  foo
0  2.0    1
1  NaN    3

I would like a function that turns df back into the original list of dictionaries, data. Unfortunately,

assert df.to_dict('records') == data

fails because the former is

[{'bar': 2.0, 'foo': 1.0}, {'bar': nan, 'foo': 3.0}]

with the additional 'bar': nan key-value pair in the second item. How can I get back the original data?

score 7 · Accepted Answer · answered Nov 29 '17 at 02:54

Here's another way of doing it:

df.T.apply(lambda x: x.dropna().to_dict()).tolist()

Output:

[{'bar': 2.0, 'foo': 1.0}, {'foo': 3.0}]
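A note on the transpose: apply works column by column by default, so transposing first turns each original row into a column that the lambda can clean up. Here is a minimal sketch of the same idea without the transpose. Using iterrows() is a choice made for this illustration, not something taken from the answer above.

```python
import pandas as pd

data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

# iterrows() yields (index, row) pairs, where each row is a Series.
# dropna() removes the missing cells before the row becomes a dict.
records = [row.dropna().to_dict() for _, row in df.iterrows()]

# The values come back as floats, which still compare equal to the
# ints in the original data, so this comparison holds.
assert records == data
```

As with the one-liner above, the numeric types are not restored: the values are floats even where the input had ints.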

answered Nov 29 '17 at 02:54

Scott Boston

Kurt Peek · Answer 2 · score 1 · 2017-11-30T20:33:53.633

I managed to fix the problem with some 'post-processing':

import pandas as pd

data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

result = df.to_dict('records')
result2 = [{k: v for k, v in row.items() if not pd.isnull(v)} for row in result]

assert result2 == data

More elegant solutions are welcome.
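The post-processing step can be wrapped in a small helper so it is reusable. This is only a sketch: the name records_without_nan is made up here, and it assumes every cell holds a scalar, since pd.notna returns an array for list-like cells.

```python
import pandas as pd


def records_without_nan(df):
    """Return one dict per row, skipping cells that pandas treats as missing.

    Hypothetical helper for illustration; assumes scalar cell values.
    """
    return [
        {key: value for key, value in row.items() if pd.notna(value)}
        for row in df.to_dict('records')
    ]


data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
assert records_without_nan(pd.DataFrame(data)) == data
```

pd.notna also treats None and NaT as missing, so the same helper should cover object and datetime columns.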

edited Nov 30 '17 at 20:33

answered Nov 29 '17 at 02:34

BENY · Answer 3 · score 1 · 2017-11-29T03:05:42.537

If I understand correctly:

1st Option

df.apply(lambda x: [x.dropna().to_dict()], axis=1).sum()
Out[860]: [{'bar': 2.0, 'foo': 1.0}, {'foo': 3.0}]

2nd Option

df.stack().groupby(level=0).apply(lambda x: [x.reset_index(level=0,drop=True).to_dict()]).sum()
Out[867]: [{'bar': 2.0, 'foo': 1.0}, {'foo': 3.0}]
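To unpack the second option: stack() reshapes the frame into one long Series indexed by (row label, column name), and grouping on the first index level collects each row's surviving values. The sketch below follows the same steps but builds the list with a comprehension instead of summing one-element lists. It calls dropna() explicitly so the result does not depend on whether stack() drops NaN by default in the installed pandas version.

```python
import pandas as pd

data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

# Long format: a Series with a (row label, column name) MultiIndex.
# The explicit dropna() removes the missing cells regardless of stack()'s default.
stacked = df.stack().dropna()

# One group per original row; dropping the row level leaves column -> value.
records = [group.droplevel(0).to_dict() for _, group in stacked.groupby(level=0)]

assert records == data
```

One caveat with any stack-based approach: a row whose values are all missing has no entries left after the drop, so it yields no dict at all rather than an empty one.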

edited Nov 29 '17 at 03:05

answered Nov 29 '17 at 02:55