
This question is similar to "Pandas DataFrame to List of Dictionaries", except that the DataFrame is not 'full': it contains some NaN values. Suppose I generate a DataFrame from a list of dictionaries like so:

import pandas as pd

data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

so that the resulting df looks like

   bar  foo
0  2.0    1
1  NaN    3

I would like a function that turns df back into the original list of dictionaries, data. Unfortunately,

assert df.to_dict('records') == data

fails because the former is

[{'bar': 2.0, 'foo': 1.0}, {'bar': nan, 'foo': 3.0}]

with the additional 'bar': nan key-value pair in the second item. How can I get back the original data?

Kurt Peek

3 Answers


Here's another way of doing it:

df.T.apply(lambda x: x.dropna().to_dict()).tolist()

Output:

[{'bar': 2.0, 'foo': 1.0}, {'foo': 3.0}]
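
The transpose is what lets apply see one original row at a time. A minimal sketch of the same idea without transposing, assuming row-wise iteration with iterrows is acceptable for the size of the frame (the name rows is only illustrative):

```python
import pandas as pd

data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

# Each row arrives as a Series; dropna() removes the missing entries
# before the row is converted to a dictionary.
rows = [row.dropna().to_dict() for _, row in df.iterrows()]

# Values come back as floats, but 1.0 == 1 in Python, so this still holds.
assert rows == data
```

Like the one-liner above, this returns floats rather than the original integers, because each row is upcast to a common dtype.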
Scott Boston

I managed to fix the problem with some 'post-processing':

import pandas as pd

data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

result = df.to_dict('records')
result2 = [{k: v for k, v in row.items() if not pd.isnull(v)} for row in result]

assert result2 == data

More elegant solutions are welcome.
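
For reuse, the post-processing step can be wrapped in a small helper. This is only a sketch: the function name records_without_nulls is hypothetical, and it assumes every cell holds a scalar, since pd.isnull returns an array for list-like values.

```python
import pandas as pd


def records_without_nulls(df):
    """Return one dict per row, omitting keys whose value is null."""
    return [
        {key: value for key, value in row.items() if not pd.isnull(value)}
        for row in df.to_dict('records')
    ]


data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

# 'bar' is stored as a float column because of the NaN, but 2.0 == 2
# in Python, so the comparison with the original data still succeeds.
assert records_without_nulls(df) == data
```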

Kurt Peek

If I understand correctly:

1st Option

df.apply(lambda x: [x.dropna().to_dict()], axis=1).sum()
Out[860]: [{'bar': 2.0, 'foo': 1.0}, {'foo': 3.0}]

2nd Option

df.stack().groupby(level=0).apply(lambda x: [x.reset_index(level=0,drop=True).to_dict()]).sum()
Out[867]: [{'bar': 2.0, 'foo': 1.0}, {'foo': 3.0}]
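
A variant of the stack approach that does not rely on summing lists, offered as a sketch rather than a drop-in replacement. It calls dropna() explicitly, on the assumption that whether stack() discards NaN entries by itself can depend on the pandas version. The names stacked and result are illustrative.

```python
import pandas as pd

data = [{'foo': 1, 'bar': 2}, {'foo': 3}]
df = pd.DataFrame(data)

# Long format: one entry per (row label, column name) pair.
# The explicit dropna() removes missing cells regardless of stack()'s default.
stacked = df.stack().dropna()

# Group by the original row label, then drop that level so that only
# the column names remain as dictionary keys.
result = [
    group.droplevel(0).to_dict()
    for _, group in stacked.groupby(level=0)
]

assert result == data
```

Note that a row consisting entirely of NaN would vanish from the result here instead of producing an empty dictionary.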
BENY