R: find data frame index of multiple conditions

Question

Given two data frames s and q with five observations each:

set.seed(8)
s <- data.frame(id=sample(c('Z','X'), 5, T),
                t0=sample(1:10, 5, T), 
                t1 = sample(11:30, 5, T))

q <- data.frame(id=sample(c('Z','X'), 5, T),
                 t0=sample(1:10, 5, T), 
                 t1 = sample(11:30, 5, T))


> s
  id t0 t1
1  Z  8 20
2  Z  3 12
3  X 10 19
4  X  8 21
5  Z  7 13

> q
  id t0 t1
1  X  3 30
2  Z  5 12
3  Z  7 23
4  Z  3 21
5  X  7 27

The midpoint between t0 and t1 for each observation is (e.g. for the s data):

s$t0+(s$t1-s$t0)/2

To find the index of the (first) observation in s whose midpoint is closest to, say, the first observation in q I can do:

i <- which.min(abs((s$t0+(s$t1-s$t0)/2) - (q$t0[1]+(q$t1[1]-q$t0[1])/2)))
s[i,]

gives:

    id t0 t1
3  X 10 19

But I cannot figure out how to find the same index in the original data s if I also want to condition on the id variable (e.g. pseudo code like: which.min(....) & s$id == q$id[1]; in this case the closest midpoint is sought only among rows whose id is 'X'). This SO question is close but not spot on. Again: I need an index that can be used in the original 5-row data set.

score 1 · Accepted Answer · answered Sep 03 '16 at 22:26

Set the elements of the which.min argument to Inf wherever your condition is not met, so that those rows can never be the minimum:

val <- abs((s$t0+(s$t1-s$t0)/2) - (q$t0[1]+(q$t1[1]-q$t0[1])/2))
val[s$id != q$id[1]] <- Inf
i <- which.min(val)
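
As a minimal sketch of the masking step above, the block below types the two data frames in by hand from the printed output in the question, so it does not depend on set.seed. The row number k and the guard for "no row with a matching id" are illustrative additions, not part of the original answer. The guard is there because which.min returns 1 when every element is Inf, which would silently point at a row with the wrong id.

```r
# Data copied by hand from the printed s and q in the question.
s <- data.frame(id = c("Z", "Z", "X", "X", "Z"),
                t0 = c(8, 3, 10, 8, 7),
                t1 = c(20, 12, 19, 21, 13),
                stringsAsFactors = FALSE)
q <- data.frame(id = c("X", "Z", "Z", "Z", "X"),
                t0 = c(3, 5, 7, 3, 7),
                t1 = c(30, 12, 23, 21, 27),
                stringsAsFactors = FALSE)

k <- 3  # row of q to match (illustrative choice; the question uses row 1)

# Distance between every midpoint in s and the midpoint of q[k, ].
val <- abs((s$t0 + s$t1) / 2 - (q$t0[k] + q$t1[k]) / 2)

# Index ignoring id, for comparison.
i_any <- which.min(val)

# Rows with a different id can never win once their distance is Inf.
val[s$id != q$id[k]] <- Inf

# Assumed guard: report NA instead of row 1 when no id matches at all.
i <- if (all(is.infinite(val))) NA_integer_ else which.min(val)

s[i, ]
```

For k = 3 the unconditioned index i_any points at an 'X' row, while i points at a 'Z' row, and i is still a row number in the full 5-row s.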

By the way, you can simplify the expression in the first line to:

val <- abs((s$t0+s$t1)/2-(q$t0[1]+q$t1[1])/2)

or even

val <- abs(s$t0+s$t1-q$t0[1]-q$t1[1])/2
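
The following sketch checks that the three ways of writing the distance agree, and then applies the same Inf masking to every row of q. The data are typed in from the printed output in the question; the helper name closest_in_s and the NA guard are hypothetical, introduced only for illustration.

```r
# Data copied by hand from the printed s and q in the question.
s <- data.frame(id = c("Z", "Z", "X", "X", "Z"),
                t0 = c(8, 3, 10, 8, 7),
                t1 = c(20, 12, 19, 21, 13),
                stringsAsFactors = FALSE)
q <- data.frame(id = c("X", "Z", "Z", "Z", "X"),
                t0 = c(3, 5, 7, 3, 7),
                t1 = c(30, 12, 23, 21, 27),
                stringsAsFactors = FALSE)

# The three equivalent forms of the distance to the first row of q.
v1 <- abs((s$t0 + (s$t1 - s$t0) / 2) - (q$t0[1] + (q$t1[1] - q$t0[1]) / 2))
v2 <- abs((s$t0 + s$t1) / 2 - (q$t0[1] + q$t1[1]) / 2)
v3 <- abs(s$t0 + s$t1 - q$t0[1] - q$t1[1]) / 2
same <- isTRUE(all.equal(v1, v2)) && isTRUE(all.equal(v2, v3))

# Hypothetical helper: row of s closest to q[k, ] among rows with the same id.
closest_in_s <- function(k) {
  val <- abs(s$t0 + s$t1 - q$t0[k] - q$t1[k]) / 2
  val[s$id != q$id[k]] <- Inf
  # Assumed guard: NA when no row of s shares the id.
  if (all(is.infinite(val))) NA_integer_ else which.min(val)
}

# One index into s per row of q.
idx <- vapply(seq_len(nrow(q)), closest_in_s, integer(1))
idx
```

Ties are resolved the way which.min always resolves them: the first matching row wins.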
