calculate diff between two values and then % difference associated to unique references month by month in pandas dataframe

Question

I have a pandas dataframe;

	ID	MONTH	TOTAL
0	REF1	1	500
1	REF1	2	501
2	REF1	3	620
3	REF2	8	5001
4	REF2	9	5101
5	REF2	10	5701
6	REF2	11	7501
7	REF2	7	6501
8	REF2	6	1501

I need to do a comparison between of difference between the ID's previous month's TOTAL.

At the moment I can calculate the difference between the row above but the comparison doesn't take into account the ID/MONTH. Would this need to be a where loop?

I have tried the below, but this returns NaN in all cells of the 'Variance' & 'Variance%' columns;

df_all.sort_values(['ID', 'MONTH'], inplace=True)
df_all['Variance'] = df_all['TOTAL'] - df_all.groupby(['ID', 'MONTH'])['TOTAL'].shift()
df_all['Variance%'] = df_all['TOTAL'] - df_all.groupby(['ID', 'MONTH'])['TOTAL'].pct_change()

The desired outcome is;

	ID	MONTH	TOTAL	Variance	Variance %
0	REF1	1	500	0	0
1	REF1	2	501	1	0.2

What is your expected output? – Azhar Khan Nov 25 '22 at 10:55 — Azhar Khan, Nov 25 '22 at 10:55
I have edited the original question to show desired outcome – Ryan1234 Nov 25 '22 at 11:44 — Ryan1234, Nov 25 '22 at 11:44

score 1 · Accepted Answer · answered Nov 25 '22 at 10:55

1

You can shift the Month by adding 1 (eventually use a more complex logic if you have real dates), then perform a self-merge and subtract:

df['diff'] = df['TOTAL'].sub(
               df[['ID', 'MONTH']]
               .merge(df.assign(MONTH=df['MONTH'].add(1)),
                      how='left')['TOTAL']
              )

Output:

     ID  MONTH  TOTAL    diff
0  REF1      1    500     NaN
1  REF1      2    501     1.0
2  REF1      3    620   119.0
3  REF2      8   5001 -1500.0   # 5001 - 6501
4  REF2      9   5101   100.0
5  REF2     10   5701   600.0
6  REF2     11   7501  1800.0
7  REF2      7   6501  5000.0   # 6501 - 1501
8  REF2      6   1501     NaN

answered Nov 25 '22 at 10:55

mozway

194,879
13
39
75

Can you please update the question with the expected output for clarity? – mozway Nov 25 '22 at 11:09
hello Mozway, can you take a look at this question ? I believe there may be a better solution: https://stackoverflow.com/questions/74567984/how-do-i-find-a-continuos-number-in-dataframe-and-apply-to-new-column/74571917#74571917 – Bushmaster Nov 25 '22 at 11:19
thank you for your help - I have updated the original question with the desired outcome. I wish to calculate the variance and then the Variance % difference – Ryan1234 Nov 25 '22 at 11:34

calculate diff between two values and then % difference associated to unique references month by month in pandas dataframe

1 Answers1