
I have the dataframe below

df = data.frame(season = rep(seq(1,4),2)
                ,product = c(rep('A', 4), rep('B', 4))
                ,revenue = 1:8
                )

I want to calculate each season's revenue as a percentage of the total within each product, so that the resulting table has the following column:

df$pc = c(0.1, 0.2, 0.3, 0.4, 0.19, 0.23, 0.27, 0.31)

I am aware this is achievable with packages such as dplyr, as discussed in "Summarizing by subgroup percentage in R". However, the challenge is to achieve this with base R functions, or with a combination of base R and user-defined functions.

Any help would be much appreciated.

Sweepy Dodo
  • Use `ave`: `ave(df$revenue, df$product, FUN = function(x) x/sum(x))` – Ronak Shah Nov 30 '18 at 14:50
  • @Ronak Shah @akrun Thank you both for your prompt contributions. Both solutions worked. I have now applied the approach to partitions over multiple variables with `with(df, revenue/ave(revenue, list(season, new_variable), FUN = sum))`. I would have done this in sqldf, but your solutions are more compact. Thank you. – Sweepy Dodo Nov 30 '18 at 17:11
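
The multi-variable partition mentioned in the comment above can be sketched as follows. This is a minimal illustration, not the commenter's actual data: the `sales` data frame and its `region` column are hypothetical stand-ins for the unspecified `new_variable`. `ave()` accepts any number of grouping vectors and applies `FUN` within each combination of their values.

```r
# Hypothetical data: `region` stands in for the extra partition variable.
sales <- data.frame(
  product = rep(c("A", "B"), each = 4),
  region  = rep(c("north", "south"), times = 4),
  revenue = 1:8
)

# ave() returns, for each row, the revenue total of that row's
# (product, region) combination; dividing gives the within-group share.
sales$pc <- with(sales, revenue / ave(revenue, product, region, FUN = sum))

# Sanity check: the shares in every (product, region) cell add up to 1.
cell_totals <- tapply(sales$pc, list(sales$product, sales$region), sum)
print(cell_totals)
```

Passing the grouping vectors as separate arguments, as here, or wrapped in a single `list(...)`, as in the comment, should give the same grouping.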

1 Answer


We can group by product and divide each revenue by its group total:

library(dplyr)
df %>%
  group_by(product) %>% 
  mutate(pc = round(revenue/sum(revenue), 2))

If we need base R, use `ave`:

df$pc <- with(df, revenue/ave(revenue, product, FUN = sum))
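
To make the `ave` step easier to follow, here is a self-contained sketch that rebuilds the data frame from the question and splits the calculation into two steps. The intermediate name `group_total` is introduced only for illustration, and the rounding to two decimals is added to match the values listed in the question.

```r
# Data frame from the question.
df <- data.frame(
  season  = rep(1:4, 2),
  product = rep(c("A", "B"), each = 4),
  revenue = 1:8
)

# ave() applies FUN within each product and returns a vector of the
# same length as the input, so every row carries its group's total.
group_total <- ave(df$revenue, df$product, FUN = sum)

# Share of each season's revenue within its product, rounded to 2 decimals.
df$pc <- round(df$revenue / group_total, 2)

print(df)
```

Because `ave()` returns one value per row rather than one value per group, the division lines up with the original rows and no merge step is needed.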
akrun