I am relatively new to python and pandas and I am trying to perform a group-wise operation using apply but struggle to get it working.
My data frame looks like this:
Year Country Val1 Val2 Fact
2005 A 1 3 1
2006 A 2 4 2
2007 A 3 5 2
2008 A 4 3 1
2009 A 4 3 1
2010 A 4 3 1
2005 B 5 7 2
2006 B 6 6 2
2007 B 7 5 1
2008 B 8 6 2
2009 B 8 6 2
2010 B 8 6 2
For each country in each year, I need to calculate (country mean for period 2005-2008 - value in 2005)/4 * Fact * (Year - 2005) + value in 2005
So far I read up on the use of apply and transform and looked at questions related to the use of both functions (e.g.1 and 2) and I thought that my problem can be solved by using a group wise apply.
I tried to set it up like so:
import pandas as pd
df = pd.DataFrame({'Year' : [2005, 2006, 2007, 2008, 2009, 2010, 2005, 2006, 2007, 2008, 2009, 2010],
'Country' : ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B'],
'Val1' : [1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 8, 8],
'Val2' : [3, 4, 5, 3, 3, 3, 7, 6, 5, 6, 6, 6,],
'Fact' : [1, 2, 2, 1, 1, 1, 2, 2, 1, 2, 2, 2]
})
def func(grp):
grad = grp[(grp['Year'] > 2004) & (grp['Year'] < 2009)].transform('mean')
ref = grp[grp['Year'] == 2005]
grad = (grad - ref)/4
res = grad * grp['Fact'] * (grp['Year']-2015) * ref
return res
df.groupby('Country').apply(func)
Running the code yields
Country Fact Val1 Val2 Year 0 1 2 3 4 5 6 7 8 9 10 11
Country
A 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
B 6 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
However, I hoped to receive something along the line of this
Year Country Val1 Val2 Fact
2005 A 1 3 1
2006 A 1.75 3.375 2
2007 A 2.5 3.75 2
2008 A 2.125 3.5625 1
2009 A 2.125 3.5625 1
2010 A 2.125 3.5625 1
2005 B 5 7 2
2006 B 5.75 6.5 2
2007 B 5.75 6.5 1
2008 B 7.25 5.5 2
2009 B 7.25 5.5 2
2010 B 7.25 5.5 2
I would be very grateful if anybody could point me towards a solution for this problem.