I want to get the log of the lagged value of a column divided by that column in pandas.

There's a simple way of doing this by adding a new column like this:

import pandas as pd
from numpy import log

df = pd.DataFrame({'reading': [2,3,4,5,6,7,8]})

df['lagged'] = df.reading.shift(1)
df['log'] = df.apply(lambda x: log(x['lagged']/ x['reading']), axis=1)

I'm wondering if there's a simpler way to do this without adding a new column.

Mehdi Zare

1 Answer

You are right: the new column is not necessary, and apply is redundant here too. Divide the shifted Series by the original column, then pass the result to log. This vectorized form also performs better:

df['log'] = log(df.reading.shift() / df.reading)
print (df)
   reading       log
0        2       NaN
1        3 -0.405465
2        4 -0.287682
3        5 -0.223144
4        6 -0.182322
5        7 -0.154151
6        8 -0.133531
jezrael
  • Thanks @jezrael! Quick follow-up: how can I account for cases where the denominator is zero? I could use an if/else in the lambda and return np.nan in the else branch; is there a similar syntax here? – Mehdi Zare Mar 21 '20 at 15:35
  • @MehdiZare - I tested it and got `inf` or `-inf`, so those should be replaced by `NaN`s after applying my solution; see [this](https://stackoverflow.com/a/49814125/2901002) – jezrael Mar 21 '20 at 15:45
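
The follow-up about zero values can be sketched as below. This is only a minimal illustration, not necessarily what the linked answer does: the sample data with a zero reading is made up, and the `np.errstate` context is an optional addition that just silences NumPy's divide-by-zero warnings while taking the log.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing a zero reading
df = pd.DataFrame({'reading': [2, 0, 4, 5]})

# Lagged value divided by the current value; a zero denominator gives inf,
# a zero numerator gives 0 (whose log is -inf)
ratio = df['reading'].shift() / df['reading']

# Suppress the runtime warnings raised by log(0) (optional)
with np.errstate(divide='ignore', invalid='ignore'):
    df['log'] = np.log(ratio)

# Replace both infinities with NaN after the vectorized computation
df['log'] = df['log'].replace([np.inf, -np.inf], np.nan)
print(df)
```

With this data, only the last row keeps a finite value (the log of 4/5); the rows affected by the zero reading end up as NaN, the same as the first row, which has no lagged value.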