0

I have this df

df <- data.frame('ID' = c(1,1,1,1,1,3,4,4,4,5),
                 'YEAR' = c(2002,2002,2002,2003,2003,2005,2010,2010,2010,2008),
                 'WAGES' = c(100,98,60,120,80,300,50,40,30,500));

And I want to add a column to the df which has the WAGES as percentages of the max wage for each unique pair of YEAR and ID. Thus, the result should be like this:

R> df
   ID YEAR WAGES  PRC
1   1 2002   100 1.00
2   1 2002    98 0.98
3   1 2002    60 0.60
4   1 2003   120 1.00
5   1 2003    80 0.67
6   3 2005   300 1.00
7   4 2010    50 1.00
8   4 2010    40 0.80
9   4 2010    30 0.60
10  5 2008   500 1.00
s_baldur
  • 29,441
  • 4
  • 36
  • 69
H.Dave
  • 37
  • 5

2 Answers2

2

We can use a data.table approach:

library(data.table)
setDT(df)
df[, PRC := WAGES / max(WAGES), by = .(YEAR, ID)]
df
    ID YEAR WAGES       PRC
 1:  1 2002   100 1.0000000
 2:  1 2002    98 0.9800000
 3:  1 2002    60 0.6000000
 4:  1 2003   120 1.0000000
 5:  1 2003    80 0.6666667
 6:  3 2005   300 1.0000000
 7:  4 2010    50 1.0000000
 8:  4 2010    40 0.8000000
 9:  4 2010    30 0.6000000
10:  5 2008   500 1.0000000

But if you prefer good old base R, then here is one solution.

mw_yid <- aggregate(WAGES ~ paste(ID, YEAR), df, max)
mw_yid
  paste(ID, YEAR) WAGES
1          1 2002   100
2          1 2003   120
3          3 2005   300
4          4 2010    50
5          5 2008   500
df$PRC <- df$WAGES / mw_yid[[2]][match(paste(df$ID, df$YEAR), mw_yid[[1]])]
df
   ID YEAR WAGES       PRC
1   1 2002   100 1.0000000
2   1 2002    98 0.9800000
3   1 2002    60 0.6000000
4   1 2003   120 1.0000000
5   1 2003    80 0.6666667
6   3 2005   300 1.0000000
7   4 2010    50 1.0000000
8   4 2010    40 0.8000000
9   4 2010    30 0.6000000
10  5 2008   500 1.0000000
s_baldur
  • 29,441
  • 4
  • 36
  • 69
0

We can use a tidyverse approach:

library(tidyverse)

df <- df %>% group_by(YEAR, ID) %>% mutate(PRC = WAGES/max(WAGES))

> df
# A tibble: 10 x 4
# Groups:   YEAR, ID [5]
      ID  YEAR WAGES       PRC
   <dbl> <dbl> <dbl>     <dbl>
 1     1  2002   100 1.0000000
 2     1  2002    98 0.9800000
 3     1  2002    60 0.6000000
 4     1  2003   120 1.0000000
 5     1  2003    80 0.6666667
 6     3  2005   300 1.0000000
 7     4  2010    50 1.0000000
 8     4  2010    40 0.8000000
 9     4  2010    30 0.6000000
10     5  2008   500 1.0000000
LAP
  • 6,605
  • 2
  • 15
  • 28