I'm not sure whether this has been asked before. I want to replace the NaN values in a data frame by merging it with another one. The data frame df contains NaN values in certain columns (col1, col2, col3). I grouped by the id column and summed each of these columns, in other words, I took the sum of all values in col1, col2 and col3 for each id number.
df_group1 = df.groupby('id')[['col1']].sum()
df_group2 = df.groupby('id')[['col2']].sum()
df_group3 = df.groupby('id')[['col3']].sum()
Then I combined those three data frames into one.
df_group = pd.concat([df_group1, df_group2, df_group3], axis = 1)
Subsequently, I divided those values by the number of rows that have each id.
# Divide each row of df_group by the number of rows in df with that id
df_group = df_group.div(df.groupby('id').size(), axis=0)
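As a sanity check of the steps so far, here is a minimal sketch on made-up data (the values are invented for illustration, and the names cols, sums and counts are my own). It assumes the goal is to divide each id's sums by the number of rows with that id, done with one div call aligned on id rather than a loop.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only
df = pd.DataFrame({
    "id":   [1111, 1111, 2222, 2222, 2222],
    "col1": [400.0, np.nan, 10.0, 20.0, np.nan],
    "col2": [1.0, 3.0, np.nan, 6.0, 3.0],
    "col3": [np.nan, 8.0, 9.0, np.nan, np.nan],
})

cols = ["col1", "col2", "col3"]

# Per-id sums; sum() skips NaN by default
sums = df.groupby("id")[cols].sum()

# Number of rows per id (rows with NaN are counted too)
counts = df.groupby("id").size()

# Divide each id's row of sums by that id's row count (aligned on the id index)
df_group = sums.div(counts, axis=0)
print(df_group)
```

Note that because the row count includes rows with NaN, this is not the same as the per-id mean of the non-missing values.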
Now I want to merge this data frame with df in order to replace the NaN values in df with those in df_group. So if a row in df_group has id number 1111 and the respective col1 value is 200, I want to replace all NaN values in col1 of df with 200 for all rows with id 1111. What is the best way to do this?
EDIT: Say I have the data frame df_group (Image 1). I want to replace all NaNs in df (Image 2) with the values from df_group, matched on id and column name.
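To make the desired result concrete, here is a minimal sketch with made-up data (the frames below are invented stand-ins, not the ones in the images). It assumes df_group is indexed by id and shares its column names with df. It is one possible approach, not necessarily the best: for each column, look up the df_group value for every row's id with map, and use it only where df has NaN.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for df
df = pd.DataFrame({
    "id":   [1111, 1111, 2222, 2222],
    "col1": [np.nan, 50.0, 7.0, np.nan],
    "col2": [1.0, np.nan, np.nan, 4.0],
})

# Made-up stand-in for df_group: one row per id, indexed by id
df_group = pd.DataFrame(
    {"col1": [200.0, 10.0], "col2": [2.0, 3.0]},
    index=pd.Index([1111, 2222], name="id"),
)

filled = df.copy()
for col in df_group.columns:
    # Look up this column's df_group value for each row's id
    per_row = filled["id"].map(df_group[col])
    # Keep existing values, fill only the NaNs
    filled[col] = filled[col].fillna(per_row)

print(filled)
```

With this data, the NaN in col1 for id 1111 becomes 200 and the existing value 50 is left untouched.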