I know there are many ways to fill NaNs in pandas, but I have not found how to do the following. In my DataFrame, some columns (e.g., name) have a value only at the beginning of each block; the rest of the block is filled with NaNs until the next block starts. I would like to extend each value down to the next value in the column. Note that the number is not necessarily 1 in the row where the name changes. An example follows:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': np.asarray(["Bob", np.nan, np.nan, "Alice", np.nan, np.nan, np.nan]),
    'number': np.asarray([1, 2, 3, 1, 2, 3, 4])
})
    name  number
0    Bob       1
1    nan       2
2    nan       3
3  Alice       1
4    nan       2
5    nan       3
6    nan       4
and what I would like to have is this:
    name  number
0    Bob       1
1    Bob       2
2    Bob       3
3  Alice       1
4  Alice       2
5  Alice       3
6  Alice       4
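For reference, here is a minimal sketch of one way this target frame might be produced, assuming a plain forward fill (`Series.ffill`) is acceptable. One caveat is labeled in the code: `np.asarray` on a list that mixes strings with `np.nan` yields a string array, so the gaps are the literal text "nan" rather than real missing values. The `mask` step that converts them back is an assumption tied to this particular construction; it would be unnecessary if the column already held real NaNs.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": np.asarray(["Bob", np.nan, np.nan, "Alice", np.nan, np.nan, np.nan]),
    "number": np.asarray([1, 2, 3, 1, 2, 3, 4]),
})

# Assumption: the gaps are the string "nan" (a side effect of np.asarray),
# so turn them into real missing values before filling.
names = df["name"].mask(df["name"] == "nan")

# Forward fill: every missing entry takes the last valid value above it.
# No per-name limit is needed, because the fill stops at the next real value.
df["name"] = names.ffill()

print(df)
```

Because the fill is driven by where the next non-missing name appears, it does not depend on the values in the number column at all.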
I was thinking of finding the indices of the unique names using df["name"].dropna().unique() (though for some reason the result still contains nan despite the dropna) and then using fillna with limit, but I could not figure out how to vary the limit depending on the name, or how to fill NaNs at given indices. The other reason I'm asking is that I believe there must be an easier and more straightforward way. Thank you.
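The nan that survives dropna is probably explained by how the example frame is built rather than by dropna itself. The short check below uses only the construction from the example (the variable names `raw`, `s`, and `fixed` are illustrative). It suggests that `np.asarray` coerces `np.nan` to the string "nan" when the list also contains strings, so pandas sees no missing values to drop.

```python
import numpy as np
import pandas as pd

# Same construction as in the example: strings mixed with np.nan.
raw = np.asarray(["Bob", np.nan, np.nan, "Alice"])
print(raw.dtype.kind)  # 'U' means a NumPy unicode string array

s = pd.Series(raw)
print(s.isna().sum())       # number of values pandas treats as missing
print(s.dropna().unique())  # the text "nan" is still present

# Building the Series from a plain list keeps np.nan as a real missing value,
# so dropna behaves as expected.
fixed = pd.Series(["Bob", np.nan, np.nan, "Alice"])
print(fixed.dropna().unique())
```

If the real data comes from a file or another source instead of `np.asarray`, the column may already contain proper NaNs, in which case this check would report a nonzero missing count.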