I have a large pandas dataframe with several NaN values in different columns. Each NaN value has an associated ID, and I would like to impute those NaN values with the value associated with that ID. For example, consider:
ID COL
1 23
1 NaN
1 NaN
1 NaN
1 NaN
2 21
2 NaN
2 NaN
2 NaN
3 25
3 NaN
3 NaN
As you can see, ID 1 is associated with 23, so every row with ID 1 must be imputed with 23, and so on for the other IDs. The expected output would be:
ID COL
1 23
1 23
1 23
1 23
1 23
2 21
2 21
2 21
2 21
3 25
3 25
3 25
How can I do such an operation with pandas? My problem is that I do not know how to take the known value for an ID and use it to replace the NaN values that share that ID.
UPDATE
After reading the answers from this question and other associated questions I tried to:
df.sort_values(['ID','COL']).ffill()
However, this is not working: it does not replace the NaN values with those associated with the IDs. Maybe the reason is that my COL values are strings. Any idea how to deal with this?
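For reference, here is a minimal reproducible sketch of the situation and one possible way to get the expected output. It rests on an assumption that is not confirmed above: that the missing entries in COL are the literal string "NaN" rather than real missing values, which would explain why ffill leaves them untouched. The column names and data are the ones from the example.

```python
import pandas as pd

# Example data from the question. COL holds strings, and the missing
# entries are ASSUMED to be the literal text "NaN", not real NaN values.
df = pd.DataFrame({
    "ID":  [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "COL": ["23", "NaN", "NaN", "NaN", "NaN",
            "21", "NaN", "NaN", "NaN",
            "25", "NaN", "NaN"],
})

# Convert the placeholder strings into real missing values so that
# pandas treats them as missing.
df["COL"] = df["COL"].mask(df["COL"] == "NaN")

# Broadcast the first non-missing value of each ID to every row of that ID.
# "first" skips missing values, so row order within a group does not matter.
df["COL"] = df.groupby("ID")["COL"].transform("first")

print(df)
```

Grouping by ID keeps a value from one ID from leaking into the next one, which a plain ffill over the whole sorted frame cannot guarantee when an ID has no known value at all.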