Get the closest number of every element in two R data tables

Question

Let's say I have 2 data tables with arbitrary positions and scores. All positions in each data table are unique. For example:

dt1:

position score
10       6.2
21       4.5
37       3.6

dt2:

position score
8        12.2
32       4.2
45       3.8
52       4.9

And I want to get position in df2 that is the closest to each row in df1. So for example, position 10 in df1 has df2 of position 8 as the closest. In the end the result will look like this:

position.dt1 score.dt1 position.dt2 score.dt2 distance
10           6.2       8            12.2      2
21           4.5       32           4.2       11
37           3.6       32           4.2       5

How to achieve this in R?

what if there were two positions in dt2 with the same distance? — M--, Sep 23 '19 at 21:38
You can join with a `roll="nearest"` in *data.table* - take a look at the examples in `?data.table`. E.g. - https://stackoverflow.com/questions/44169249/r-data-table-roll-nearest-not-actually-nearest or https://stackoverflow.com/questions/47897193/rolling-join-by-in-data-table-r and https://stackoverflow.com/questions/54013468/merging-two-sets-of-data-by-data-table-roll-nearest-function — thelatemail, Sep 23 '19 at 21:44

score 2 · Answer 1 · answered Sep 23 '19 at 21:42

In Base R

inds = sapply(dt1$position, function(x) which.min(abs(x - dt2$position)))
transform(dt1, pos.dt2 = dt2$position[inds],
          score.dt2 = dt2$score[inds],
          distance = abs(position - dt2$position[inds]))
#  position score pos.dt2 score.dt2 distance
#1       10   6.2       8      12.2        2
#2       21   4.5      32       4.2       11
#3       37   3.6      32       4.2        5

Get the closest number of every element in two R data tables

1 Answers1