I have a pd.Series that looks like this:

>>> series
0     This is a foo bar something...
1                                NaN
2                                NaN
3        foo bar indeed something...
4                                NaN
5                                NaN
6               foo your bar self...
7                                NaN
8                                NaN

How do I fill the NaN values with the previous non-NaN value in the series?

I have tried this:

new_column = []

for row in series:
    if isinstance(row, str):
        new_column.append(row)
    else:
        # NaN: repeat the previous value (fails if the very first row is NaN)
        new_column.append(new_column[-1])

series = pd.Series(new_column)

But is there a built-in way to do the same in pandas?

alvas

1 Answer

From the docs:

DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)

...

method : {‘backfill’, ‘bfill’, ‘pad’, ‘ffill’, None}, default None

Method to use for filling holes in reindexed Series.
pad / ffill: propagate last valid observation forward to next valid.
backfill / bfill: use NEXT valid observation to fill gap.

So:

series.fillna(method='ffill')
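
As far as I know, recent pandas releases (2.1 and later) emit a deprecation warning for `fillna(method=...)` and point to the dedicated `Series.ffill()` method instead. A minimal sketch of that form, assuming the series from the question is rebuilt by hand with `np.nan` for the missing rows:

```python
import numpy as np
import pandas as pd

# Series from the question, retyped by hand (np.nan marks the missing rows).
series = pd.Series([
    "This is a foo bar something...", np.nan, np.nan,
    "foo bar indeed something...", np.nan, np.nan,
    "foo your bar self...", np.nan, np.nan,
])

# Forward fill: every NaN takes the last non-NaN value above it.
filled = series.ffill()
print(filled)
```

This should give the same values as the loop in the question, without building an intermediate list.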

Some explanation:

  • ffill / pad: Forward fill uses the value from the nearest previous row that isn't NA to populate the NA value. pad is just an alias for ffill.

  • bfill / backfill: Back fill uses the value from the nearest following row that isn't NA to populate the NA value. backfill is just a verbose alias for bfill.

In code:

>>> import pandas as pd
>>> import numpy as np
>>> np.nan
nan

>>> series = pd.Series([np.nan, 'abc', np.nan, np.nan, 'def', np.nan, np.nan])

>>> series
0    NaN
1    abc
2    NaN
3    NaN
4    def
5    NaN
6    NaN
dtype: object

>>> series.fillna(method='ffill')
0    NaN
1    abc
2    abc
3    abc
4    def
5    def
6    def
dtype: object

>>> series.fillna(method='bfill')
0    abc
1    abc
2    def
3    def
4    def
5    NaN
6    NaN
dtype: object
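
Note that in the outputs above a forward fill leaves the leading NaN in place and a backward fill leaves the trailing NaNs in place, because there is no value on that side to copy. A small sketch of two variations, assuming the same toy series: chaining `ffill()` and `bfill()` to cover both ends, and using the `limit` parameter from the signature to cap how many consecutive NaNs get filled. The variable names `both_ends` and `capped` are only illustrative.

```python
import numpy as np
import pandas as pd

series = pd.Series([np.nan, "abc", np.nan, np.nan, "def", np.nan, np.nan])

# Forward fill first, then back fill whatever is still missing at the start.
both_ends = series.ffill().bfill()
print(both_ends.tolist())
# ['abc', 'abc', 'abc', 'abc', 'def', 'def', 'def']

# Fill at most one consecutive NaN after each valid value.
capped = series.ffill(limit=1)
print(capped.isna().tolist())
# [True, False, False, True, False, False, True]
```

Whether filling the leading gap from the next value is appropriate depends on the data; for something like section headers spread over rows, plain `ffill()` is usually what is wanted.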
alvas