Each of the columns col1, col2 and col3 needs either a forward or a backward fill, depending on the value in column col4.
Say I have a dataframe like so:
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, np.nan, 3, np.nan, 5, np.nan],
                   'col2': [7, np.nan, 9, np.nan, 11, np.nan],
                   'col3': [13, 14, 15, 16, np.nan, 18],
                   'col4': [2015, 2015, 2015, 2016, 2016, 2018]})
   col1  col2  col3  col4
0   1.0   7.0  13.0  2015
1   NaN   NaN  14.0  2015
2   3.0   9.0  15.0  2015
3   NaN   NaN  16.0  2016
4   5.0  11.0   NaN  2016
5   NaN   NaN  18.0  2018
I am trying to implement this through groupby.
grouped = df.groupby('col4')
Then I loop over the groups and, if a group name meets a condition, I do a fill (backward or forward) and try to update the dataframe.
for name, group in grouped:
    if name == 2015:
        df[df.col4==name][['col1', 'col2']] = grouped.get_group(name)[['col1', 'col2']].ffill(axis=0)
    elif name == 2016:
        df[df.col4==name]['col1'] = grouped.get_group(name)['col1'].ffill(axis=0)
        df[df.col4==name][['col2', 'col3']] = grouped.get_group(name)[['col1', 'col2']].bfill(axis=0)
    else:
        df[df.col4==name]['col1', 'col2', 'col3'] = grouped.get_group(name)['col1'].bfill(axis=0)
However, this does not work and looks very lengthy.
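As far as I understand, part of the problem is that df[df.col4==name][...] = ... is chained indexing, so the assignment lands on a temporary copy and df itself is never updated. Below is a minimal sketch of the behaviour I am after, writing through .loc with a boolean mask instead. The rules dict, default_rule and filled are hypothetical names I made up for illustration, and the rules are only my reading of the intent: 2015 forward-fills col1 and col2, 2016 forward-fills col1 and backward-fills col2 and col3, and any other year backward-fills all three columns.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, np.nan, 3, np.nan, 5, np.nan],
                   'col2': [7, np.nan, 9, np.nan, 11, np.nan],
                   'col3': [13, 14, 15, 16, np.nan, 18],
                   'col4': [2015, 2015, 2015, 2016, 2016, 2018]})

# Hypothetical rule table (an assumption about the intended behaviour):
# for each year, which columns to forward-fill and which to backward-fill.
rules = {
    2015: {'ffill': ['col1', 'col2'], 'bfill': []},
    2016: {'ffill': ['col1'], 'bfill': ['col2', 'col3']},
}
# Any year not listed above backward-fills all three columns.
default_rule = {'ffill': [], 'bfill': ['col1', 'col2', 'col3']}

filled = df.copy()  # leave the original df untouched
for year in filled['col4'].unique():
    mask = filled['col4'] == year
    rule = rules.get(year, default_rule)
    # .loc[mask, cols] = ... writes into `filled` itself, not into a copy.
    # The fill only sees rows of this year, so values never leak across years.
    if rule['ffill']:
        cols = rule['ffill']
        filled.loc[mask, cols] = filled.loc[mask, cols].ffill()
    if rule['bfill']:
        cols = rule['bfill']
        filled.loc[mask, cols] = filled.loc[mask, cols].bfill()

print(filled)
```

This still loops over the years, but the per-year rules live in one dict rather than in an if/elif chain, which is the part I would like to make shorter.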
This post looks similar.
I would appreciate any suggestions.