0

Hi R and {pivottabler} community,

I'm trying to replicate a simple pivot table from Power BI in R, however I found it's kind of tricky to derive the Month-over-Month Totals properly in R using pivottabler.

Here is the simple context:

here is my raw data

 df <-data.frame(month = c(1,1,2),
                 name = c("x1","x2","x1"),
                 rate = c(0.63, 0.76,0.51))
df

  month name score
1     1   x1 0.63
2     1   x2 0.76
3     2   x1 0.51

Now I'm in month 2 and I need to calculate the Month-over-Month change - I need to generate the lagged score variable. In Power BI, it would give something below (expected result)

   name score score_lagged score_MoM_pts
1    x1  0.51         0.63         -0.12
2 Total  0.51         1.39         -1.24

However, when I use pivottabler in R to do this "group by sum" the Total score_lagged is only showing 0.63, which makes sense because x2 is not in month 2 thus the 0.76 under x2 month 1 won't be counted in.

In R, I'm using the trick to mutate( month = month+1 ) to generate the lagged variable (my real dataset is much more complex thus lag() approach will not be accurate), that is, to full join the two following tables on c("name", "month")

  month name score
1     1   x1  0.63
2     1   x2  0.76
3     2   x1  0.51
  month name score_lagged
1     2   x1         0.63
2     2   x2         0.76
3     3   x1         0.51

and below is the joined tables:

 df_final <- df1 %>% 
   full_join(df2, by = c("month","name"))
  month name score score_lagged
1     1   x1  0.63           NA
2     1   x2  0.76           NA
3     2   x1  0.51         0.63
4     2   x2    NA         0.76
5     3   x1    NA         0.51

I further filter to month =2 since I'm interested in month = 2 only

 df_final <- df_final %>% 
   filter(month ==2)
  month name score score_lagged
1     2   x1  0.51         0.63
2     2   x2    NA         0.76

Lastly, I'm feeding pivottabler with the table above

 pt <- PivotTable$new()
 pt$addData(df_final)
 pt$addRowDataGroups("name")
 pt$defineCalculation(calculationName="score", summariseExpression="sum(score)", format = list(big.mark=","))
 pt$defineCalculation(calculationName="score_lagged", summariseExpression="sum(score_lagged)", format = list(big.mark=","))
 pt$defineCalculation(calculationName="score_MoM_pts", summariseExpression="sum(score)-sum(score_lagged)", format = list(big.mark=","))
 pt$renderPivot()

which gives me below

  name  sum(score) sum(score_lagged) score_MoM_pts
  <chr> <dbl>        <dbl>         <dbl>
1 x1     0.51          0.63         -0.12
2 x2       NA          0.76            NA
3 Total    NA          1.39            NA

This is not the expected result shown below

   name score score_lagged score_MoM_pts
1    x1  0.51         0.63         -0.12
2 Total  0.51         1.39         -1.24

the main dilemma is that x2 itself should not be in the result, however if I remove x2 in the input table, the total score_lagged would not add up to 1.39

Please kindly point me to the right direction, and thanks in advance.

0 Answers0