In my data:

data=structure(list(v1 = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), 
    v2 = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), x = c(10L, 
    1L, 2L, 3L, 4L, 3L, 2L, 30L, 3L, 5L)), .Names = c("v1", "v2", 
"x"), class = "data.frame", row.names = c(NA, -10L))

There are three variables. I need to keep only the rows where x has its maximum value within each category of v1. For example, take the first category of v1 and find the v2 category for which x is largest. It is

v1=1, v2=1, x=10

Then take the second category of v1 and do the same. It is v1=2, v2=3, x=30

So the desired output is:

v1  v2  x
1   1   10
2   3   30

How can I do this?

psysky

2 Answers

Here is a solution using data.table:

library(data.table)
setDT(data)
data[, .SD[which.max(x)], keyby = v1]

   v1 v2  x
1:  1  1 10
2:  2  3 30
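
A common variant of the same idea selects row numbers with `.I` instead of subsetting `.SD`, which tends to be faster on large tables. The following is only a sketch: the names `dt`, `idx`, and `res_dt` are made up for illustration, and the toy data from the question is rebuilt so the block runs on its own.

```r
library(data.table)

# Same toy data as in the question, rebuilt here so the block is self-contained.
dt <- data.table(
  v1 = rep(1:2, each = 5),
  v2 = rep(1:5, times = 2),
  x  = c(10L, 1L, 2L, 3L, 4L, 3L, 2L, 30L, 3L, 5L)
)

# .I holds row numbers of the full table; keep the one with the largest x per v1.
# The unnamed result column is called V1 by data.table.
idx <- dt[, .I[which.max(x)], by = v1]$V1

# Subset the original table by those row numbers.
res_dt <- dt[idx]
print(res_dt)
```

Like `which.max()` in the answer above, this keeps only the first row per group if several rows share the maximum.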

And for completeness an ugly base-R solution:

t(sapply(split(data, data[["v1"]]), function(s) s[which.max(s[["x"]]),]))
  v1 v2 x 
1 1  1  10
2 2  3  30
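
If the matrix-of-lists result of `t(sapply(...))` is inconvenient, one possible base-R alternative is to compare each `x` with its group maximum computed by `ave()`. This is a sketch, not the answerer's method; `df`, `group_max`, and `res_base` are illustrative names, and the data is rebuilt as a plain data.frame.

```r
# Same toy data as in the question, as a plain data.frame.
df <- data.frame(
  v1 = rep(1:2, each = 5),
  v2 = rep(1:5, times = 2),
  x  = c(10L, 1L, 2L, 3L, 4L, 3L, 2L, 30L, 3L, 5L)
)

# ave() returns, for every row, the maximum of x within that row's v1 group.
group_max <- ave(df$x, df$v1, FUN = max)

# Keep the rows whose x equals their group maximum.
res_base <- df[df$x == group_max, ]
print(res_base)
```

Unlike `which.max()`, this keeps every tied row when a group has more than one maximum, and the result stays an ordinary data.frame with the original row names.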
s_baldur

Using dplyr:

library(dplyr)

data %>%
  group_by(v1) %>%
  filter(x == max(x))

# A tibble: 2 x 3
# Groups:   v1 [2]
     v1    v2     x
  <int> <int> <int>
1     1     1    10
2     2     3    30
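
Note that `filter(x == max(x))` returns every tied row when a group has several maxima. If exactly one row per group is wanted, `slice_max()` with `with_ties = FALSE` may be an option. The sketch below assumes dplyr 1.0.0 or later; `df` and `res_dplyr` are illustrative names.

```r
library(dplyr)

# Same toy data as in the question, rebuilt so the block is self-contained.
df <- data.frame(
  v1 = rep(1:2, each = 5),
  v2 = rep(1:5, times = 2),
  x  = c(10L, 1L, 2L, 3L, 4L, 3L, 2L, 30L, 3L, 5L)
)

res_dplyr <- df %>%
  group_by(v1) %>%
  # Keep the single row with the largest x in each group, dropping ties.
  slice_max(x, n = 1, with_ties = FALSE) %>%
  ungroup()

print(res_dplyr)
```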
tmfmnk