1

Lets say we have a 3d array:

my.array <- array(1:27, dim=c(3,3,3))

I would like to create a list of the n first neighbors.

Example: Lets get my.array[2,2,2]=14, so the first neighbors of 14 is:

list[14] = [1 to 27] - 14

I also would like to do the same for second, third, n closest neighbors using R, C or Matlab.

Thanks

DemetriusRPaula
  • 377
  • 1
  • 10
  • 2
    So the "first neighbors of 14" are a vector ranging from -13 to 13? What are the second neighbors of 14? Third? (I do not understand what you mean or are requesting.) – r2evans Jun 21 '16 at 04:36
  • the first neighbor for (1,1,1) is 2 and so on. the second is 2,3 and so on... – DemetriusRPaula Jun 21 '16 at 11:38
  • 1
    *"2 and so on"* is not at all useful. Are "first level neighbors" those with a distance of 1? I'll assume euclidean distance, not Manhattan or other. If so, then the neighbors of [1,1,1] should be 2, 4, and 10, not including 5, 11, or 13 (dist 1.4). Please clear up your definition. – r2evans Jun 21 '16 at 13:16

2 Answers2

2

Based on the comments, I assume you are defining "first nearest neighbor" as all cells with a euclidean distance of 1 or less (excluding self), "second nearest neighbors" as those with 2 or less, etc. Your assertion in a comment in @evan058's answer that "for (1,1,1) the first level neighbors is 2,4,5,10,11,13", I'm actually interpreting this to include the immediate diagonals (with a distance of 1.414) but not further diagonals (in your example, 14 would be a further diagonal with a distance of 1.732).

This function accepts either a pre-defined array (ary) or the dimensions to make one (dims).

nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 1)
#      dim1 dim2 dim3
# [1,]    2    1    1
# [2,]    1    2    1
# [3,]    1    1    2
nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 1,
                 return_indices = FALSE)
# [1]  2  4 10
nearestNeighbors(dims = c(3,3,3), elem = c(1,1,1), dist = 2,
                 return_indices = FALSE)
#  [1]  2  3  4  5  7 10 11 13 14 19

nearestNeighbors(ary = array(27:1, dim = c(3,3,3)), elem = c(1,1,1), dist = 2)
#       dim1 dim2 dim3
#  [1,]    2    1    1
#  [2,]    3    1    1
#  [3,]    1    2    1
#  [4,]    2    2    1
#  [5,]    1    3    1
#  [6,]    1    1    2
#  [7,]    2    1    2
#  [8,]    1    2    2
#  [9,]    2    2    2
# [10,]    1    1    3
nearestNeighbors(ary = array(27:1, dim = c(3,3,3)), elem = c(1,1,1), dist = 2,
                 return_indices = FALSE)
#  [1] 26 25 24 23 21 18 17 15 14  9

The function:

#' Find nearest neighbors.
#'
#' @param ary array
#' @param elem integer vector indicating the indices on array from
#'   which all nearest neighbors will be found; must be the same
#'   length as \code{dims} (or \code{dim(ary)}). Only one of
#'   \code{ary} and \code{dim} needs to be provided.
#' @param dist numeric, the max distance from \code{elem}, not
#'   including the 'self' point.
#' @param dims integer vector indicating the dimensions of the array.
#'   Only one of \code{ary} and \code{dim} needs to be provided.
#' @param return_indices logical, whether to return a matrix of
#'   indices (as many columns as dimensions) or the values from
#'   \code{ary} of the nearest neighbors
#' @return either matrix of indices (one column per dimension) if
#'   \code{return_indices == TRUE}, or the appropriate values in
#'   \code{ary} otherwise.
nearestNeighbors <- function(ary, elem, dist, dims, return_indices = TRUE) {
  if (missing(dims)) dims <- dim(ary)
  tmpary <- array(1:prod(dims), dim = dims)
  if (missing(ary)) ary <- tmpary
  if (length(elem) != length(dims))
      stop("'elem'' needs to have the same dimensions as 'ary'")
  # work on a subset of the whole matrix
  usedims <- mapply(function(el, d) {
    seq(max(1, el - dist), min(d, el + dist))
  }, elem, dims, SIMPLIFY=FALSE)
  df <- as.matrix(do.call('expand.grid', usedims))
  # now, df is only as big as we need to possibly satisfy `dist`
  ndist <- sqrt(apply(df, 1, function(x) sum((x - elem)^2)))
  ret <- df[which(ndist > 0 & ndist <= dist),,drop = FALSE]
  if (return_indices) {
    return(ret)
  } else {
    return(ary[ret])
  }
}

Edit: changed the code for a "slight" speed improvement: using a 256x256x256 array and a distance of 2 previously took ~90 seconds on my machine. Now it takes less than 1 second. Even a distance of 5 (same array) takes less than a second. Not fully tested, please verify it is correct.

Edit: Removed the extra { on the fifty line of the function.

Community
  • 1
  • 1
r2evans
  • 141,215
  • 6
  • 77
  • 149
  • It works, but I am dealing with a 256x256x256, for each element is taking 15 seconds. Need to do the same for 16.777.216 points :( Is there a way to make it faster. I wont need more than 5 for distance. Thanks r2evans. – DemetriusRPaula Jun 21 '16 at 19:53
  • You can do similar things *much* faster with [`Rcpp`](http://rcpp.org/). It'll deal with the dimensionality without problem, and with its ["sugar"](https://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-sugar.pdf), much of the code you need is very similar to the code above. – r2evans Jun 21 '16 at 20:02
  • You should be able to speed things up by restricting yourself to a submatrix of the whole. That is, you *know* that you can limit yourself to +/- `dist` cells in each direction, so you can reduce the 256x256x256 array to a 11x11x11 array, calculate the neighbors, then re-offset based on the center cell. Since this reduces the matrix size by a factor of no less than 12,600 (worst case), it should speed up quite a bit. – r2evans Jun 21 '16 at 20:05
  • You can speed it up just a little more if you (a) hard-code which of `ary` or `dims` you will be providing all the time, and (b) you take out the length check. This doesn't save you much, but if you're dealing with `256^3` checks, any microsecond may help. – r2evans Jun 21 '16 at 20:56
  • Good catch (`{`), I was in the middle of improving the readability when "work" caught up to me :-) – r2evans Jun 21 '16 at 21:36
  • Could you please help me on this one? You are the guy! http://stackoverflow.com/questions/38000994/speed-up-loops-and-condition-with-r?noredirect=1#comment63460530_38000994 @r2evans – DemetriusRPaula Jun 25 '16 at 03:17
0

I think something along these lines will do the trick:

nClosest <- function(pts, pt, n)
{
  # Get the target value
  val <- pts[pt[1], pt[2], pt[3]]
  # Turn the matrix into a DF
  ptsDF <- adply(pts, 1:3)
  # Create Dist column for distance to val
  ptsDF$Dist <- abs(ptsDF$V1 - val)
  # Order by the distance to val
  ptsDF <- ptsDF[with(ptsDF, order(Dist)),]
  # Split into groups:
  sp <-  split(ptsDF, ptsDF$Dist)
  # Get max index
  topInd = min(n+1, length(sp))
  # Agg the split dfs into a single df
  rbind.fill(sp[2:topInd])
}

Output:

> nClosest(my.array, c(1,2,2), 3)
  X1 X2 X3 V1 Dist
1  3  1  2 12    1
2  2  2  2 14    1
3  2  1  2 11    2
4  3  2  2 15    2
5  1  1  2 10    3
6  1  3  2 16    3
evan.oman
  • 5,922
  • 22
  • 43