pandas: replace NaN with the last non-NaN value in column

Question

I have an Excel file which lists basketball teams and the players on each team. The first row for a new team states the team name in column 0 and a player on that team in column 1. Each following row has only a player in column 1 (nothing in column 0, as the team is implied from the last stated team). This is repeated for every team.

Warriors    Stephen Curry
-           Klay Thompson
-           Kevin Durant
Clippers    Chris Paul
-           Blake Griffen
-           JJ Redick
Raptors     Kyle Lowry
-           Demar Derozan

I'm importing the data into a pandas dataframe and counting the number of players on each team.

import pandas as pd
df = pd.read_excel('data.xlsx')
print(df)

     Team        Player
0    Warriors    Stephen Curry
1    NaN         Klay Thompson
2    NaN         Kevin Durant
3    Clippers    Chris Paul
4    NaN         Blake Griffen
5    NaN         JJ Redick
6    Raptors     Kyle Lowry
7    NaN         Demar Derozan

Is there any way I can replace NaN with the appropriate team name? (I know I could just fill in the empty spots in the Excel file, but it looks much cleaner if I handle this on the import or via pandas.) I imagine I need to iterate through the dataframe, store the team name if it's not NaN, and replace NaN with the currently stored team name until a new team arises.

If you don't know basketball, my dataframe should look like this when all is said and done:

     Team        Player
0    Warriors    Stephen Curry
1    Warriors    Klay Thompson
2    Warriors    Kevin Durant
3    Clippers    Chris Paul
4    Clippers    Blake Griffen
5    Clippers    JJ Redick
6    Raptors     Kyle Lowry
7    Raptors     Demar Derozan

Note that -- as some of the answers in the linked dup mention -- you can use `.ffill()` directly these days. — DSM, Mar 31 '17 at 00:30

Craig · Accepted Answer · 2017-03-31T00:34:55.600

16

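You can do this using the fillna() method on the dataframe. Passing method='ffill' tells it to fill forward, so each NaN is replaced with the last valid value above it in the same column. fillna() returns a new dataframe rather than modifying df, so assign the result back:

df = df.fillna(method='ffill')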
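
As a minimal sketch of the forward-fill approach, the example below rebuilds the question's dataframe by hand instead of reading data.xlsx, so the construction step and the `counts` name are assumptions made for illustration. It uses the `.ffill()` spelling mentioned in the comment above, applied only to the Team column, and then does the per-team count the question asks about. Recent pandas releases deprecate the `method=` argument of `fillna()`, so `.ffill()` is probably the safer choice on current versions.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame by hand (assumption: stands in for data.xlsx).
df = pd.DataFrame({
    "Team": ["Warriors", np.nan, np.nan,
             "Clippers", np.nan, np.nan,
             "Raptors", np.nan],
    "Player": ["Stephen Curry", "Klay Thompson", "Kevin Durant",
               "Chris Paul", "Blake Griffen", "JJ Redick",
               "Kyle Lowry", "Demar Derozan"],
})

# Forward-fill only the Team column, so a genuinely missing value in
# another column would not be overwritten by the row above it.
df["Team"] = df["Team"].ffill()

# Count players per team; sort=False keeps teams in order of first appearance.
counts = df.groupby("Team", sort=False)["Player"].count()

print(df)
print(counts)
```

Filling a single column is a deliberate choice here: calling `df.ffill()` on the whole frame works for this data too, but it would also fill gaps in every other column.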

