Consider the dictionary d:
d = {'A': {'x': 1, 'y': 1}, 'B': {'y': 1, 'z': 1}}
When I pass this to the pandas.DataFrame constructor, I know I'll have missing values for row x, column B and row z, column A.
import pandas as pd
df = pd.DataFrame(d)
df
     A    B
x  1.0  NaN
y  1.0  1.0
z  NaN  1.0
I want those NaN to be filled in with 0. Of course I know I can fill them in afterwards.
df.fillna(0)
But now they are all floats
     A    B
x  1.0  0.0
y  1.0  1.0
z  0.0  1.0
Yes! I could have forced them to integers
df.fillna(0).astype(int)
   A  B
x  1  0
y  1  1
z  0  1
Or! I could have constructed a Series with a clever dictionary comprehension and unstacked it with the fill_value parameter:
pd.Series(
{(i, j): v for j, d_ in d.items() for i, v in d_.items()}
).unstack(fill_value=0)
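For reference, here is a minimal self-contained sketch that puts the two workarounds side by side and checks that they agree. It assumes only that pandas is imported as pd; the names filled, stacked and unstacked are just illustrative.

```python
import pandas as pd

d = {'A': {'x': 1, 'y': 1}, 'B': {'y': 1, 'z': 1}}

# Workaround 1: build the frame (NaN forces float columns),
# fill the gaps, then cast back to integers.
filled = pd.DataFrame(d).fillna(0).astype(int)

# Workaround 2: build a Series keyed by (row, column) tuples, which
# pandas turns into a MultiIndex, then pivot the column level out.
# The values never pass through NaN, so they stay integers.
stacked = pd.Series(
    {(i, j): v for j, d_ in d.items() for i, v in d_.items()}
)
unstacked = stacked.unstack(fill_value=0)

print(filled)
print(unstacked)
```

Both results should hold the same values with integer columns; the second one simply avoids the intermediate float frame.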
But all this would be a ton easier if there were a direct way to fill in missing values with a default from the start. I'd expect something like
pd.DataFrame(d, dtype=int, fill_value=0)
I know that isn't available, but is there something else I've missed?