Problem: I have 2 datasets with no matching identifiers (like ID) and need to find the closest match in df1$time to df2$tstart. df1 (with time column) has 660,000 rows with time stamps approximately every 0.00125 s. Whatever the closest match is to df2$tstart, I would like a new column made (df1$trial_start) that says "yes", otherwise "no".
I've tried findInterval, but it only seems to match in ascending order, and doesn't check values in both directions. In the below code, it looks good for most of the outputs, but there are some indices where the value after the returned index is closer to $tstart
#my actual code:
index_closest <- findInterval(iti_summaries_2183[["24"]]$tstart, poke_1s$time)
poke_1s$trial_start <- ifelse(seq_len(nrow(poke_1s)) %in% index_closest, "yes", "no")
I've also tried which.min, which doesn't work since the lists lengths don't match. Additionally, I've fought with roll = "nearest" like here but the functions return values and I'm not sure how to create a new column and assign y/n.
Code to replicate problem:
n <- 773
df1 <- structure(list(initiate = sample(c(0,1), replace=TRUE, size=n),
left = sample(c(0,1), replace=TRUE, size=n),
right = sample(c(0,1), replace=TRUE, size=n),
time = seq(from = 2267.2, to = 2363.75, by = 0.125)))
df1 <- data.frame(df1)
df2 <- structure(list(trial = c(156:162),
control = c(0, 0, 0, 0, 3, 0, 3),
t_start = c(2267.231583, 2289.036355, 2298.046849, 2318.933635, 2328.334036, 2347.870449, 2363.748095),
t_end = c(2268.76760, 2290.83370, 2299.38547, 2320.71400, 2329.93985, 2349.15464, 2365.12455)),
class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -7L), spec = structure(list(
cols = list(trial = structure(list(), class = c("collector_double",
"collector")), control = structure(list(), class = c("collector_double",
"collector")), t_start = structure(list(), class = c("collector_double",
"collector")), t_end = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))