
I have an array in R, created by a function like this:

A <- array(data=NA, dim=c(2,4,4), dimnames=list(c("x","y"),NULL,NULL))

And I would like to select along one dimension, so for the example above I would have:

A["x",,]
dim(A["x",,])    #[1] 4 4

Is there a way to generalize this if I do not know in advance how many dimensions (in addition to the named one I want to select by) my array might have? I would like to write a function that takes input that might be formatted as A above, or as:

B <- c(1,2)
names(B) <- c("x", "y")

C <- matrix(1, 2, 2, dimnames=list(c("x","y"),NULL))

Background

The general background is that I am working on an ODE model. deSolve's ode function requires the current state as a single named vector. For some other tasks, like calculating phase planes or direction fields, it would be more practical to apply the differential equation to a higher-dimensional array, and I would like to avoid keeping many copies of the same function that differ only in the number of commas after the dimension I want to select.

flodel
kai

4 Answers


I spent quite a lot of time figuring out the fastest way to do this for plyr, and the best I could come up with was manually constructing the call to [:

index_array <- function(x, dim, value, drop = FALSE) { 
  # Create list representing arguments supplied to [
  # bquote() creates an object corresponding to a missing argument
  indices <- rep(list(bquote()), length(dim(x)))
  indices[[dim]] <- value

  # Generate the call to [
  call <- as.call(c(
    list(as.name("["), quote(x)),
    indices,
    list(drop = drop)))
  # Print it, just to make it easier to see what's going on
  print(call)

  # Finally, evaluate it
  eval(call)
}

(You can find more information about this technique at https://github.com/hadley/devtools/wiki/Computing-on-the-language)

You can then use it as follows:

A <- array(data=NA, dim=c(2,4,4), dimnames=list(c("x","y"),NULL,NULL))
index_array(A, 2, 2)
index_array(A, 2, 2, drop = TRUE)
index_array(A, 3, 2, drop = TRUE)

It would also generalise in a straightforward way if you want to extract based on more than one dimension, but you'd need to rethink the arguments to the function.
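
As a rough sketch of that generalisation (not the version used in plyr), the function below takes a vector of dimension positions and a list of index values. The name index_array_multi and the arguments dims and values are made up for this illustration.

```r
# Hypothetical generalisation of index_array(): `dims` is an integer vector of
# dimension positions and `values` is a list with one index per position.
index_array_multi <- function(x, dims, values, drop = FALSE) {
  stopifnot(length(dims) == length(values))

  # Start with a missing argument for every dimension, i.e. x[, , ]
  indices <- rep(list(bquote()), length(dim(x)))
  # Fill in only the requested dimensions
  indices[dims] <- values

  # Build and evaluate the call to [
  call <- as.call(c(
    list(as.name("["), quote(x)),
    indices,
    list(drop = drop)))
  eval(call)
}

A <- array(1:32, dim = c(2, 4, 4),
           dimnames = list(c("x", "y"), LETTERS[1:4], letters[1:4]))

# Equivalent to A["x", c("C", "D"), , drop = TRUE]
res <- index_array_multi(A, c(1, 2), list("x", c("C", "D")), drop = TRUE)
dim(res)
```

Passing values as a list keeps each index intact, so one dimension can be selected by a vector of several names while another is selected by a single name.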

hadley
  • Also very nice. So I assume `A["x",,]` is faster than `A["x",1:4,1:4]`, is that correct? – flodel Jan 24 '13 at 13:35
  • @flodel IIRC, yes (especially when the arrays get bigger). You could also do `A["x", T, T]` – hadley Jan 24 '13 at 13:37
  • @flodel `microbenchmark(A["x", ,], A["x", 1:4, 1:4], A["x", T, T])` - so not much in it, but I probably cared because `*aply` might be calling it millions of times. – hadley Jan 24 '13 at 13:38
  • All good solutions, this one wins by a bit benchmarking on my data, and is nicely commented. Thanks! – kai Jan 24 '13 at 22:35
  • Not strictly necessary for my case, but is it somehow possible to extract the indices earlier? This seems it could be very useful later. So I might be able to call: `A[index_array(A,1,'x')] <- 2` to set values. I hope this explains what I want... – kai Jan 25 '13 at 09:43
  • @kai not with this approach, and in general it will be very difficult – hadley Jan 25 '13 at 12:36
  • I meet with the same problem, and your answer is so brilliant! Thanks! – 163 Jun 30 '15 at 08:52
  • Thank you for your efforts. It seems such a simple problem for there not to be a built-in solution. – CJB Sep 08 '15 at 15:59
  • I think if you change the line "`indices[[dim]] <- value`" to "`indices[dim] <- value`" then `dim` and `value` can be vectors of length > 1, allowing indexing by multiple dimensions. – CJB Sep 08 '15 at 16:26

I wrote this general function. Not necessarily super fast but a nice application for arrayInd and matrix indexing:

extract <- function(A, .dim, .value) {

    val.idx  <- match(.value, dimnames(A)[[.dim]])
    all.idx  <- arrayInd(seq_along(A), dim(A))
    keep.idx <- all.idx[all.idx[, .dim] == val.idx, , drop = FALSE]
    array(A[keep.idx], dim = dim(A)[-.dim], dimnames = dimnames(A)[-.dim])

}

Example:

A <- array(data=1:32, dim=c(2,4,4),
           dimnames=list(c("x","y"), LETTERS[1:4], letters[1:4]))

extract(A, 1, "x")
extract(A, 2, "D")
extract(A, 3, "b")
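
Because the index matrix built with arrayInd can be used on the left-hand side of an assignment as well, the same idea may also cover the case where you want to set values rather than extract them. This is only a sketch: slice_index is a hypothetical helper name, and it assumes the array has dimnames along the chosen dimension.

```r
# slice_index() is a hypothetical helper, not part of base R. It returns a
# matrix with one row per selected cell and one column per dimension.
slice_index <- function(A, .dim, .value) {
  val.idx <- match(.value, dimnames(A)[[.dim]])
  all.idx <- arrayInd(seq_along(A), dim(A))
  all.idx[all.idx[, .dim] %in% val.idx, , drop = FALSE]
}

A <- array(data = 1:32, dim = c(2, 4, 4),
           dimnames = list(c("x", "y"), LETTERS[1:4], letters[1:4]))

idx <- slice_index(A, 1, "x")
A[idx] <- 0L    # matrix indexing: overwrites every cell in the "x" slice
```

Building the full index matrix costs memory proportional to the size of the array, so this is probably only sensible for moderately sized arrays.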
flodel

The abind package has a function, asub, to do this in addition to other very useful array manipulation functions:

library(abind)
A <- array(data=1:32, dim=c(2,4,4),
           dimnames=list(c("x","y"), LETTERS[1:4], letters[1:4]))

asub(A, 'x', 1)
asub(A, 'D', 2)
asub(A, 'b', 3)

And it allows indexing in multiple dimensions:

asub(A, list('x', c('C', 'D')), c(1,2))
oropendola

Perhaps there is an easier way, but this works:

do.call("[",c(list(A,"x"),lapply(dim(A)[-1],seq)))
     [,1] [,2] [,3] [,4]
[1,]   NA   NA   NA   NA
[2,]   NA   NA   NA   NA
[3,]   NA   NA   NA   NA
[4,]   NA   NA   NA   NA

Let's generalize it into a function that can extract from any dimension, not necessarily the first one:

extract <- function(A, .dim, .value) {
    idx.list <- lapply(dim(A), seq_len)
    idx.list[[.dim]] <- .value
    do.call(`[`, c(list(A), idx.list))
}

Example:

A <- array(data=1:32, dim=c(2,4,4),
           dimnames=list(c("x","y"), LETTERS[1:4], letters[1:4]))

extract(A, 1, "x")
extract(A, 2, "D")
extract(A, 3, "b")
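
The question also mentions plain named vectors such as B, for which dim() returns NULL. A possible variant, sketched here under the assumption that a dimensionless vector should count as one-dimensional, uses TRUE as the "select everything" index. extract_any is a hypothetical name chosen for this illustration.

```r
# extract_any() is a hypothetical name; it treats a vector without a dim
# attribute as having a single dimension.
extract_any <- function(A, .dim, .value) {
  n.dims <- max(length(dim(A)), 1L)
  # TRUE selects everything along a dimension, like an empty index
  idx.list <- rep(list(TRUE), n.dims)
  idx.list[[.dim]] <- .value
  do.call(`[`, c(list(A), idx.list))
}

# The three inputs from the question
A <- array(data = NA, dim = c(2, 4, 4), dimnames = list(c("x", "y"), NULL, NULL))
B <- c(1, 2)
names(B) <- c("x", "y")
C <- matrix(1, 2, 2, dimnames = list(c("x", "y"), NULL))

extract_any(A, 1, "x")   # same as A["x", , ]
extract_any(B, 1, "x")   # same as B["x"]
extract_any(C, 1, "x")   # same as C["x", ]
```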
flodel
James