I am pretty new to Python (I mostly use R). I would like to perform a simple calculation, but I keep getting errors and incorrect results. I want to calculate the percentage change for a column in a pandas DataFrame using the latest non-NA value. A toy example is below.
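import numpy as np
import pandas as pd
# np.nan marks the missing prices (the strings 'Nan'/'NaN' are not treated as missing)
price = [np.nan, 10, 13, np.nan, np.nan, 9]
df = pd.DataFrame(price, columns=['price'])
df['price_chg'] = df.price.pct_change(periods=-1)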
I keep getting a weird result:
price_chg = [NaN, -0.2307, 0, 0, 0.4444, NaN]
I guess this has to do with the NaN values. How do I tell pandas to use the latest non-NA value? The desired result is as follows:
price_chg = [NaN, -0.2307, 0.4444, 0, 0, NaN]
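For reference, here is a minimal sketch that reproduces the desired output above. It assumes that "latest non-NA value" means the next available price further down the column (a back-fill), since that is what the desired numbers imply. It computes the change by hand with shift instead of relying on how pct_change treats NaN, which, as far as I can tell, is not the same in every pandas version. The names filled and chg are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [np.nan, 10, 13, np.nan, np.nan, 9]})

# Assumption: a missing price takes the next available value below it.
filled = df["price"].bfill()

# Same formula as pct_change(periods=-1): current row / next row - 1.
chg = filled / filled.shift(-1) - 1

# Keep NaN for rows before the first real observation
# (forward-filling leaves exactly those rows as NaN).
df["price_chg"] = chg.where(df["price"].ffill().notna())

print(df)
```

The last row stays NaN because there is no following row to compare against.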
Since I don't know much Python at all, any suggestions would be welcome, even more convoluted ones.