This will do it:
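findNearest3 <- function(x, y, z){
  # keep only the values strictly inside the limits z, in ascending order
  temp <- sort(x[x > z[1] & x < z[2]])
  # position of the value closest to y (the first one in case of ties)
  point <- which.min(abs(temp - y))
  # the nearest value plus its neighbours in the sorted vector
  return(temp[c(point-1, point, point+1)])
}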
The function will look for the nearest value to y within vector x, constrained by limits z, and return this value plus the one before and after within the sorted vector.
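To make the steps concrete, here is a small walk-through of what happens inside the function. The input vector is made up for illustration and is not part of the original example.

```r
# made-up input, chosen only to illustrate the steps
x <- c(0.21, 0.50, 0.29, 0.38, 0.10, 0.33)
y <- 0.3
z <- c(0.2, 0.4)

# 1. drop everything outside the limits and sort the rest
temp <- sort(x[x > z[1] & x < z[2]])  # 0.21 0.29 0.33 0.38

# 2. find the position of the value closest to y
#    (which.min() gives the same position as which(... == min(...)) when there are no ties)
point <- which.min(abs(temp - y))     # 2, because 0.29 is closest to 0.3

# 3. take that value and its two neighbours
result <- temp[c(point - 1, point, point + 1)]
result                                # 0.21 0.29 0.33
```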
Example:
set.seed(123)
df <- data.frame(x = rnorm(100), y = rnorm(100))
sapply(df, findNearest3, .3, c(.2, .4))
             x         y
[1,] 0.2533185 0.2982276
[2,] 0.3035286 0.3011534
[3,] 0.3317820 0.3104807
Now with
sapply(df, function(x) mean(findNearest3(x, .3, c(.2, .4))))
you'll get the means:
        x         y
0.2962097 0.3032872
Be aware that this will return NA if there are not enough values within the given constraints z:
df <- data.frame(x = c(.1, .23, .35, .5), y = c(.22, .24, .33, .48))
sapply(df, findNearest3, .3, c(.2, .4))
        x    y
[1,] 0.23 0.24
[2,] 0.35 0.33
[3,]   NA   NA
sapply(df, function(x) mean(findNearest3(x, .3, c(.2, .4)), na.rm = TRUE))
    x     y
0.290 0.285
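If you would rather not get NA at all, one option is to clip the three positions to the valid range before subsetting. The sketch below is only a suggestion; findNearest3Safe is a hypothetical name. It also covers the opposite edge case: when the nearest value is the smallest one within the limits, the position point-1 is 0, which R silently drops.

```r
# Hypothetical variant: returns up to three values, never NA padding
findNearest3Safe <- function(x, y, z) {
  temp <- sort(x[x > z[1] & x < z[2]])
  if (length(temp) == 0) return(numeric(0))  # nothing inside the limits
  point <- which.min(abs(temp - y))
  idx <- (point - 1):(point + 1)
  idx <- idx[idx >= 1 & idx <= length(temp)]  # discard positions outside temp
  temp[idx]
}

# nearest value is the largest within the limits
findNearest3Safe(c(.1, .23, .35, .5), .3, c(.2, .4))  # 0.23 0.35
# nearest value is the smallest within the limits
findNearest3Safe(c(.29, .35, .38), .25, c(.2, .4))    # 0.29 0.35
```

Because the result can now have fewer than three elements, sapply() may return a list instead of a matrix when the columns yield different lengths.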
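Edit: To return the positions of the values instead of the values themselves, change the last line of the code (this version also drops the limits z). Note that the positions refer to the sorted vector, so they equal the original row numbers only if x is already sorted: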
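findNearest3.pos <- function(x, y){
  # ascending order; positions below index this sorted vector
  temp <- sort(x)
  # position of the value closest to y (the first one in case of ties)
  point <- which.min(abs(temp - y))
  return(c(point-1, point, point+1))
}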
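Since sort() discards the original order, the positions returned by findNearest3.pos index the sorted vector. If you need positions in the original, unsorted vector, a possible approach is to use order() instead. This is a sketch with a hypothetical function name and made-up data, not the code used for the outputs shown in this answer.

```r
# Hypothetical variant: positions refer to the original (unsorted) vector
findNearest3.rows <- function(x, y) {
  ord <- order(x)                          # ord[i] = original position of the i-th smallest value
  point <- which.min(abs(x[ord] - y))      # position of the nearest value in sorted order
  idx <- (point - 1):(point + 1)
  idx <- idx[idx >= 1 & idx <= length(x)]  # discard positions outside the vector
  ord[idx]                                 # map back to original positions
}

x <- c(0.50, 0.21, 0.38, 0.29, 0.33)       # made-up data
rows <- findNearest3.rows(x, 0.3)
rows     # 2 4 5
x[rows]  # 0.21 0.29 0.33
```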
Application:
To use it on another dataframe of the same dimensions, first save the positions in a list:
myrows <- lapply(df, findNearest3.pos, y = .3)
and then subset the second dataframe:
set.seed(234)
df1 <- data.frame(x = rnorm(100), y = rnorm(100))
newsubset <- mapply(function(x, y) x[y], df1, myrows)
              x        y
[1,] -0.9581388 2.214151
[2,]  0.6280635 0.455070
[3,]  0.6625872 0.513053
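The mapply() call walks over the columns of the data frame and the list of positions in parallel, so each column is subset with its own positions. A tiny illustration with made-up numbers (the positions in pos are assumed, standing in for myrows):

```r
df_small <- data.frame(x = c(11, 12, 13, 14), y = c(21, 22, 23, 24))
pos <- list(x = c(1, 2, 3), y = c(2, 3, 4))  # assumed positions, one vector per column

# column x is subset with pos$x, column y with pos$y
res <- mapply(function(col, idx) col[idx], df_small, pos)
res
#       x  y
# [1,] 11 22
# [2,] 12 23
# [3,] 13 24
```

The result is a matrix only when every position vector has the same length; otherwise mapply() returns a list.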
For the other data frame, which has only one column, you need to decide which column's positions you want to use.
set.seed(345)
df2 <- data.frame(x = rnorm(100))
You could access the row positions found in V1 (or, in this example, x) with:
df2[myrows[[1]],]
[1] 0.2986353 -0.9917691 -0.6510206
and those found in V2 (here named y) with:
df2[myrows[[2]],]
[1] -0.3148442 -0.2491949 0.6854260
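Note that subsetting a one-column data frame this way returns a plain vector, as in the outputs above. If you want to keep a data frame, you can add drop = FALSE. A short sketch with made-up values (rows is an assumed set of positions, standing in for myrows[[1]]):

```r
df_one <- data.frame(x = c(10, 20, 30, 40, 50))  # made-up one-column data frame
rows <- c(2, 4, 5)                               # assumed positions

as_vector <- df_one[rows, ]                # numeric vector: 20 40 50
as_frame  <- df_one[rows, , drop = FALSE]  # still a data frame with one column
```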