How do I fill nans between some values in pandas DataFrame?

Question

I know there are plenty of ways to fill NaNs in pandas, but I have not found how to do the following. In my DataFrame, some columns (e.g., name) have a value only at the beginning of a block; the rest of the block is filled with NaNs until the next block starts. I would like to extend each value down to the next value in the column. Note that the number is not necessarily 1 when the name changes. Example follows:

import numpy as np
import pandas as pd
df = pd.DataFrame({
    'name': np.asarray(["Bob", np.nan, np.nan, "Alice", np.nan, np.nan, np.nan]),
    'number': np.asarray([1,2,3,1,2,3,4])
})

    name    number
0   Bob     1
1   nan     2
2   nan     3
3   Alice   1
4   nan     2
5   nan     3
6   nan     4

and what I would like to have is this:

    name    number
0   Bob     1
1   Bob     2
2   Bob     3
3   Alice   1
4   Alice   2
5   Alice   3
6   Alice   4

My idea was to find the indices of the unique names (using df["name"].dropna().unique(), although for some reason it still returns nan as a value despite the dropna) and then use fillna with limit. However, I could not figure out how to vary the limit per name, or how to fill NaNs starting at a given index. I am also asking because I believe there must be an easier, more straightforward way. Thank you.

IIUC, use [`df.ffill()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.ffill.html) – Ch3steR Jan 04 '21 at 11:07
`it still gives me nan as a value inside despite using dropna` - you probably have the string repr of a NaN, `'nan'`; try replacing those first: `df["name"].replace('nan',np.nan).dropna()` – anky Jan 04 '21 at 11:07
The problem in your example is that `np.asarray` casts all the NaNs to the string `'nan'` – adir abargil Jan 04 '21 at 11:13
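The point made in these comments can be checked directly. The following is a minimal sketch, assuming a reasonably recent NumPy and pandas; the variable names `arr`, `n_missing` and `unique_after_dropna` are only illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

# Mixing str and float in np.asarray yields a unicode string array,
# so every NaN is stored as the text 'nan' rather than a missing value.
arr = np.asarray(["Bob", np.nan, np.nan, "Alice", np.nan, np.nan, np.nan])
print(arr.dtype.kind)  # kind 'U' means a fixed-width unicode string array
print(str(arr[1]))

df = pd.DataFrame({"name": arr, "number": [1, 2, 3, 1, 2, 3, 4]})

# pandas sees no real missing values, so dropna() has nothing to remove
# and the string 'nan' survives into unique().
n_missing = int(df["name"].isna().sum())
unique_after_dropna = list(df["name"].dropna().unique())
print(n_missing, unique_after_dropna)
```

This is why `dropna()` appeared to do nothing in the question: the column holds strings only, and `'nan'` is just another string.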

score 0 · Accepted Answer · answered Jan 04 '21 at 11:07

Try:

df['name'] = df['name'].replace('nan', np.nan)  # string 'nan' -> real NaN
df['name'] = df['name'].ffill()  # forward-fill and assign the result back
df

Out:

    name  number
0    Bob       1
1    Bob       2
2    Bob       3
3  Alice       1
4  Alice       2
5  Alice       3
6  Alice       4

Suhas Mucherla

Or change the initialization: `'name': ["Bob", np.nan, np.nan, "Alice", np.nan, np.nan, np.nan]` – adir abargil Jan 04 '21 at 11:13
@adirabargil Yes, since the OP had initialized the column with a NumPy array, I had to add an extra step – Suhas Mucherla Jan 04 '21 at 11:17
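A minimal sketch of the alternative suggested in these comments, assuming the column can be built from a plain Python list instead of `np.asarray`. The NaNs then stay real missing values, the `replace` step is unnecessary, and a plain forward fill is enough. The name `filled` is illustrative only.

```python
import numpy as np
import pandas as pd

# A plain list keeps np.nan as a real missing value
# (no conversion to the string 'nan' takes place).
df = pd.DataFrame({
    "name": ["Bob", np.nan, np.nan, "Alice", np.nan, np.nan, np.nan],
    "number": [1, 2, 3, 1, 2, 3, 4],
})
print(int(df["name"].isna().sum()))  # count of real missing values

# Forward fill carries each name down until the next non-missing name.
filled = df.ffill()
print(filled)
```

Note that `df.ffill()` fills every column of the frame; to restrict the fill to one column, assign `df["name"].ffill()` back to `df["name"]` instead.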
