I have a time dependent variable represented as two vectors: a vector of times (sorted) and a vector of values at those times. I want to resample this variable at different times specified by a different sorted vector of times.
In another language I would do a simultaneous walk through the two sorted time vectors. i.e. linear search from the beginning of the old time vector until I find the time closest to the first element in the new time vector, then continuing on from that point in the old vector to find the time closest to the second element in the new vector, etc. This gives a solution that is O(n).
The key here is that the two vectors of times are not the same length and the elements are not paired one-to-one so something like map2 or walk2 isn't what I want.
I can implement the simultaneous walk with a for loop (see code below), and it works, but it's slow. I also have another solution that is more R codey, but it's O(n^2) so it also winds up being slow. Is there an R way of doing this that uses internal R implementations to wind up with an O(n) solution?
Alternatively, is there an R function that could replace my get_closest() with a binary search so at least it would be O(nlogn)?
From my searching I suspect the answer is going to be, "write a C function that you call from R", but I'm fairly new to R so I wanted to check that I'm not missing something.
EDIT:
I should have made clear that the values in new_times may not exist in old_times. I want to find the index in old_times where the time is closest to each entry in new_times. In my real application I will then do linear interpolation, but this question is just about searching for the nearest neighbor.
library(tidyverse)
# input values given
old_times <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
old_values <- c(3, 7, 6, 7, 8, 9, 7, 6, 4, 6)
new_times <- c(4.1, 9.6, 12.3, 17.8)
Desired output is
new_values <- c(7, 8, 9, 4)
My attempt
new_values <- rep(NA, length(new_times))
old_index <- 1
for (new_index in 1:length(new_times)) {
while (old_index < length(old_times) &&
old_times[old_index] < new_times[new_index]) {
old_index <- old_index + 1
}
# I could now do interpolation if the value of new_times is in between
# two values in old_times. The key is I have a correspondence that
# new_times[new_index] is close in time to old_times[old_index].
new_values[new_index] <- old_values[old_index]
}
# Here's an alternative way to do it that uses more R internals,
# but winds up being O(n^2).
# Get the index in old_times closest to new_time.
# This is O(n).
get_closest <- function(new_time, old_times) {
return(which.min(abs(new_time - old_times)))
}
# Call get_closest on each element of new_times.
# This is O(n^2).
new_indices <- unlist(map(new_times, get_closest, old_times))
# Slice the list of old values to get new values.
new_values2 <- old_values[new_indices]