
I have a Pandas dataframe as follows:

df =

                      open       high        low      close
Timestamp                                                      
2014-01-07 13:18:00  874.67040  892.06753  874.67040  892.06753
2014-01-07 13:19:00        NaN        NaN        NaN        NaN
2014-01-07 13:20:00        NaN        NaN        NaN        NaN
2014-01-07 13:21:00  883.23085  883.23085  874.48165  874.48165
2014-01-07 13:22:00        NaN        NaN        NaN        NaN

Each NaN should take the value of the previous period's close.

Edit: I have tried df.fillna(method='ffill'), but that fills each NaN with the value directly above it in the same column. I would like every NaN in a row to take the close that precedes it instead.

Using ffill yields:

                      open       high        low      close
Timestamp                                                      
2014-01-07 13:18:00  874.67040  892.06753  874.67040  892.06753
2014-01-07 13:19:00  874.67040  892.06753  874.67040  892.06753

But I am looking for:

                      open       high        low      close
Timestamp                                                      
2014-01-07 13:18:00  874.67040  892.06753  874.67040  892.06753
2014-01-07 13:19:00  892.06753  892.06753  892.06753  892.06753
Adam

2 Answers


Couple of ways:

In [3166]: df.apply(lambda x: x.fillna(df.close.shift())).ffill()
Out[3166]:
                          open       high        low      close
Timestamp
2014-01-07 13:18:00  874.67040  892.06753  874.67040  892.06753
2014-01-07 13:19:00  892.06753  892.06753  892.06753  892.06753
2014-01-07 13:20:00  892.06753  892.06753  892.06753  892.06753
2014-01-07 13:21:00  883.23085  883.23085  874.48165  874.48165
2014-01-07 13:22:00  874.48165  874.48165  874.48165  874.48165

In [3167]: df.fillna({c: df.close.shift() for c in df}).ffill()
Out[3167]:
                          open       high        low      close
Timestamp
2014-01-07 13:18:00  874.67040  892.06753  874.67040  892.06753
2014-01-07 13:19:00  892.06753  892.06753  892.06753  892.06753
2014-01-07 13:20:00  892.06753  892.06753  892.06753  892.06753
2014-01-07 13:21:00  883.23085  883.23085  874.48165  874.48165
2014-01-07 13:22:00  874.48165  874.48165  874.48165  874.48165
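
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame from the question and applies the first approach. The names `prev_close` and `filled` are only illustrative, and the sketch assumes the index is sorted by time and that each row is either fully populated or entirely NaN, as in the sample.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (values copied from the post).
idx = pd.DatetimeIndex(
    [
        "2014-01-07 13:18:00",
        "2014-01-07 13:19:00",
        "2014-01-07 13:20:00",
        "2014-01-07 13:21:00",
        "2014-01-07 13:22:00",
    ],
    name="Timestamp",
)
df = pd.DataFrame(
    {
        "open": [874.67040, np.nan, np.nan, 883.23085, np.nan],
        "high": [892.06753, np.nan, np.nan, 883.23085, np.nan],
        "low": [874.67040, np.nan, np.nan, 874.48165, np.nan],
        "close": [892.06753, np.nan, np.nan, 874.48165, np.nan],
    },
    index=idx,
)

# Close of the row directly above; NaN where that row's close is itself missing.
prev_close = df["close"].shift()

# Step 1: fill every column from the previous row's close (aligned on the index).
# Step 2: ffill covers later rows of a NaN run, whose previous close was also NaN.
filled = df.apply(lambda col: col.fillna(prev_close)).ffill()

print(filled)
```

The trailing `.ffill()` matters: without it, the second row of a consecutive NaN run (13:20 here) would stay NaN, because the shifted close for that row is missing as well.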
cs95
Zero
  • Zero. When I do this, I get the same number in every single row and column, e.g. `892.06753` for every nan value, including where it should be `874.48165`. It's like it takes the first previous number in Close and uses it in the entire dataframe. Any idea why? – Chuck Apr 30 '18 at 10:13
  • https://stackoverflow.com/questions/45262040/python-pandas-forward-filling-entire-rows-with-value-of-one-previous-column I had to use this method. – Chuck Apr 30 '18 at 10:24

You can fill the close and then backfill the rest on axis 1:

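# Forward-fill close; assigning back avoids chained inplace on df.close
df['close'] = df['close'].ffill()
# Backfill along each row, so open/high/low pick up the filled close
df = df.bfill(axis=1)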
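
A self-contained sketch of the same idea on the question's sample data, written with `ffill()` and `bfill()` rather than `fillna(method=...)`, since the `method` argument is deprecated in recent pandas releases. The name `out` is only illustrative. It assumes `close` is the last column and that a row with missing values is missing all of them, as in the sample.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; close must be the rightmost column.
df = pd.DataFrame(
    {
        "open": [874.67040, np.nan, np.nan, 883.23085, np.nan],
        "high": [892.06753, np.nan, np.nan, 883.23085, np.nan],
        "low": [874.67040, np.nan, np.nan, 874.48165, np.nan],
        "close": [892.06753, np.nan, np.nan, 874.48165, np.nan],
    },
    index=pd.DatetimeIndex(
        [
            "2014-01-07 13:18:00",
            "2014-01-07 13:19:00",
            "2014-01-07 13:20:00",
            "2014-01-07 13:21:00",
            "2014-01-07 13:22:00",
        ],
        name="Timestamp",
    ),
)

out = df.copy()

# Step 1: carry the last known close down into the empty rows.
out["close"] = out["close"].ffill()

# Step 2: within each row, pull values leftwards from the next valid column.
# For an all-NaN row the only valid value is the close filled in step 1.
out = out.bfill(axis=1)

print(out)
```

One caveat with this design: if a row were only partly missing (say `open` is NaN but `high` is present), the row-wise backfill would copy `high` into `open` rather than the previous close, so this variant fits best when gaps always span the whole row.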
chthonicdaemon