
I have a data frame with the following data:

z = data.frame(date = strptime(c(20110101,20110102,20110103,20110104,20110105,20110106),
                               format = '%Y%m%d'),
               rate1=c(1,2,3,4,5,6),
               rate2=c(2,1,3,6,8,4),
               rate3=c(4,1,3,6,8,3),
               rate4=c(7,8,9,2,1,8))

I use pmax to get the maximum value of the rate columns for each row:

z$max = with(z, pmax(rate1, rate2, rate3, rate4))

#         date rate1 rate2 rate3 rate4 max
# 1 2011-01-01     1     2     4     7   7
# 2 2011-01-02     2     1     1     8   8
# 3 2011-01-03     3     3     3     9   9
# 4 2011-01-04     4     6     6     2   6
# 5 2011-01-05     5     8     8     1   8
# 6 2011-01-06     6     4     3     8   8

The pmax function gives me the maximum value for each row, but I would also like the index of the column that holds that maximum.

Given that z$max equals the maximum values c(7, 8, 9, 6, 8, 8), I would like to get the corresponding column indices c(5, 5, 5, 3, 3, 5).

Is this possible? I know this seems like something simple but I cannot find the answer anywhere.

Henrik
thequerist

2 Answers


You could use max.col to get the column index corresponding to the maximum value:

z$max_ci = max.col(z[2:5]) + 1
z
        date rate1 rate2 rate3 rate4 max_ci
1 2011-01-01     1     2     4     7      5
2 2011-01-02     2     1     1     8      5
3 2011-01-03     3     3     3     9      5
4 2011-01-04     4     6     6     2      3
5 2011-01-05     5     8     8     1      3
6 2011-01-06     6     4     3     8      5

Note that max.col is applied only to the four rate columns (z[2:5]), so it returns a position from 1 to 4 within that subset. Adding 1 converts that position into the column index in the full data frame, which is what you asked for.


Please note the ties.method argument:

a character string specifying how ties are handled, "random" by default. If ties.method = "first", max.col returns the column number of the first of several maxima in every row. [...] Correspondingly, ties.method = "last" returns the last of possibly several indices
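
Rows 4 and 5 of the example data contain ties (rate2 equals rate3), so with the default "random" the result for those rows can be either 3 or 4. A minimal sketch, assuming the first tied column is the one wanted (as in the expected result in the question); the name max_ci is only illustrative:

```r
# Rebuild the example data from the question.
z <- data.frame(date  = seq(as.Date("2011-01-01"), by = "day", length.out = 6),
                rate1 = c(1, 2, 3, 4, 5, 6),
                rate2 = c(2, 1, 3, 6, 8, 4),
                rate3 = c(4, 1, 3, 6, 8, 3),
                rate4 = c(7, 8, 9, 2, 1, 8))

# ties.method = "first" makes the result deterministic when a row has
# several equal maxima; +1 accounts for the date column that was skipped.
max_ci <- max.col(as.matrix(z[2:5]), ties.method = "first") + 1
max_ci
```

With ties.method = "last", rows 4 and 5 would instead point at the rate3 column.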

Henrik
Galled

Very simple in base R:

z$wmax <- apply(z[, -c(1,6)],1, which.max)

That gives a value 1 less than what you asked for, because the first column is excluded from the subset (column 6, the max column from the question, is excluded as well). This is easily remedied by adding one:

# Select the rate columns explicitly so that helper columns such as wmax are not included
z$max_col_n <- apply(z[, 2:5], 1, which.max) + 1
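
The fixed offset of 1 only holds while the rate columns sit directly after the date column. As a rough sketch of a variant that does not depend on column positions, the rate columns can be selected by name and the result mapped back to indices in z. It assumes every rate column name starts with "rate"; rate_names, pos and max_col_n are illustrative names, not part of the original answer:

```r
# Rebuild the example data, including the max column from the question.
z <- data.frame(date  = seq(as.Date("2011-01-01"), by = "day", length.out = 6),
                rate1 = c(1, 2, 3, 4, 5, 6),
                rate2 = c(2, 1, 3, 6, 8, 4),
                rate3 = c(4, 1, 3, 6, 8, 3),
                rate4 = c(7, 8, 9, 2, 1, 8))
z$max <- with(z, pmax(rate1, rate2, rate3, rate4))

# Pick the rate columns by name (assumption: they all start with "rate").
rate_names <- grep("^rate", names(z), value = TRUE)

# which.max() returns the position of the first maximum within each row.
pos <- apply(z[rate_names], 1, which.max)

# Translate positions within the subset into column indices of z.
max_col_n <- match(rate_names[pos], names(z))
max_col_n
```

Because which.max returns the first maximum, tied rows resolve to the leftmost rate column.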
IRTFM