Hopefully this example speaks for itself. I want to create 'lagval' with shift() but need it to be nan if the prior year is missing.
df = DataFrame( { 'yr' : [2007,2008,2009,2011,2012],
'val': np.random.randn(5) } )
Desired output (lagval):
In [1118]: df
Out[1118]:
val yr lagval
0 -0.978139 2007 NaN
1 0.117912 2008 -0.978139
2 -1.031884 2009 0.117912
3 0.606856 2011 NaN
4 -0.200864 2012 0.606856
I have a decent solution for this (posted as an answer), but am looking for alternatives. I have spent some time looking at all the time series functions but that seems like overkill here. It seems like I would end up converting year to a true timestamp, resampling, shifting, and then dropping missing values. But maybe there is a simpler way?