2

How can a complex (i.e., not an equi-join) be done in R?

For example, suppose we have the following data:

B <- data.frame(m = 1:100, x = rnorm(100))
A <- data.frame(id = 1:200, m0 = sample(B$m, 200, replace = TRUE))
A$m1 <- A$m0 + sample(1:20, nrow(A), replace = TRUE)

Then in SQL one could do something like this:

SELECT A.id, SUM(B.x)
FROM A
JOIN B ON B.m BETWEEN A.m0 AND A.m1
GROUP BY A.id

How can one do something analogous in R (aside from using a loop)?

banbh
  • 1,331
  • 1
  • 13
  • 31
  • 2
    You are probably looking for something like [this](http://stackoverflow.com/questions/3916195/finding-overlap-in-ranges-with-r) (an overlap join). – David Arenburg Jan 20 '16 at 16:36
  • Gabor's answer http://stackoverflow.com/a/24480301/239838 (as well as other answers in that question) provide a number of approaches. – banbh Jan 20 '16 at 17:20

1 Answers1

2

One approach is to use sqldf:

# Using data defined in question
library('sqldf')
R <- sqldf('select A.id, sum(B.x) s from A join B on B.m between A.m0 and A.m1 group by A.id')
banbh
  • 1,331
  • 1
  • 13
  • 31