I have a dataframe like this one
df = pd.DataFrame({'week': [1, 1, 1, 2, 2, 2, 3, 3, 3],
'p': list(range(1, 4)) * 3,
'q': [4, 2, 1, 6, 2, 1, 6, 3, 2]})
df
week p q
0 1 1 4
1 1 2 2
2 1 3 1
3 2 1 6
4 2 2 2
5 2 3 1
6 3 1 6
7 3 2 3
8 3 3 2
In this dataframe, p identifies a product (1, 2, and 3) and q is the quantity sold in a week. I need to create two variables with within week information. The first one should have the ratio of q relative to that of the product with p=3. The second one should have the ratio relative to product 2 (for product 1) and relative to product 3 (for product 2). The desired output should be:
week p q d1 d2
0 1 1 4 4.0 2.0
1 1 2 2 2.0 2.0
2 1 3 1 1.0 1.0
3 2 1 6 6.0 3.0
4 2 2 2 2.0 2.0
5 2 3 1 1.0 1.0
6 3 1 6 3.0 2.0
7 3 2 3 1.5 1.5
8 3 3 2 1.0 1.0
This answer is related and it allows me to create the first variable (I would need to replace the NaN with ones for product p = 3).
df['d1']=np.nan
df['d1']=df.loc[df.p!=3,'d1'].\
fillna(df.q/df.groupby('week').q.transform('last'))
df
week p q d1
0 1 1 4 4.0
1 1 2 2 2.0
2 1 3 1 NaN
3 2 1 6 6.0
4 2 2 2 2.0
5 2 3 1 NaN
6 3 1 6 3.0
7 3 2 3 1.5
8 3 3 2 NaN
Two things remain that I haven't been able to figure out. First, if instead of defining the first variable relative to p=3, I were to define it relative to p=2, how would I modify the code above? (or what alternative would be better?)
Second, how can I generate the second variable? (that has the value of q relative to that of p=2 for p=1, and relative to that of p=3 for p=2).