With generic data:
set.seed(456)
a <- sample(0:1,50,replace = TRUE)
b <- rnorm(50,15,5)
df1 <- data.frame(a,b)
c <- seq(0.01,0.99,0.01)
d <- 0.5*(10*c)^2+5
df2 <- data.frame(c,d)
For each value of df1$b, we want to find the nearest value of df2$d.
Then we create a new variable df1$XYZ that takes the df2$c value of the row with the nearest df2$d.
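To make the goal concrete, here is a minimal base R sketch of the lookup on toy inputs. The names x, ref, idx and xyz and all of the values are made up for illustration; they are not the data above. It takes, for each value, the row with the smallest absolute difference, and on a tie which.min keeps the first match.

```r
# Toy inputs (hypothetical values, only for illustration)
x   <- c(5.2, 12.0, 40.0)                        # plays the role of df1$b
ref <- data.frame(c = c(0.1, 0.2, 0.3),          # plays the role of df2$c
                  d = c(5.5, 7.0, 9.5))          # plays the role of df2$d

# For each x, find the index of the ref$d value closest in absolute terms
idx <- vapply(x, function(v) which.min(abs(ref$d - v)), integer(1))

# Pull the matching ref$c values; this is what df1$XYZ should hold
xyz <- ref$c[idx]
xyz
```

This is quadratic in the worst case (every x is compared with every ref$d), which is fine for 50 by 99 values but is the reason I am looking at a join-based approach.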
This question has guided me towards the data.table library, but I am not sure whether dplyr and group_by can also be used.
Here is my data.table attempt:
library(data.table)
dt1 <- data.table( df1 , key = "b" )
dt2 <- data.table( df2 , key = "d" )
dt[ ldt , list( d ) , roll = "nearest" ]
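For comparison, this is how I understand a rolling join with roll = "nearest" is meant to be written, sketched on toy tables rather than the data above. The values are hypothetical, and the on = .(d = b) mapping and the x.c prefix (to pick column c from the table being joined) are my reading of the data.table documentation, so treat it as an assumption rather than a confirmed solution.

```r
library(data.table)

# Toy inputs (hypothetical values, only for illustration)
dt1 <- data.table(b = c(5.2, 12.0, 40.0))
dt2 <- data.table(c = c(0.1, 0.2, 0.3),
                  d = c(5.5, 7.0, 9.5))

# Rolling join: for each dt1$b, match the dt2 row whose d is nearest,
# and return that row's c (x.c refers to column c of dt2)
dt1[, XYZ := dt2[dt1, on = .(d = b), roll = "nearest", x.c]]
dt1
```

Here dt2 is the table being searched and dt1 supplies the lookup values, so the result has one row per row of dt1, in the original order of dt1.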