I have three arrays with the same dimensions. I need to replace values in dat1 with the corresponding values from dat2, but only where the corresponding values in dat3 are greater than 0.2.

data:

dat1 <- array(1:60, c(3,5,4)) 
dat2 <- array(rnorm(60), c(3,5,4)) 
dat3 <- array(rnorm(60), c(3,5,4))
Barry

2 Answers

ifelse(dat3 > 0.2, dat2, dat1)
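
A small deterministic check of this one-liner, as a sketch. The values below are made up for illustration (they are not the question's random arrays) so the result can be verified by eye. `ifelse` takes its shape from the test argument, so the result keeps the array dimensions:

```r
# Illustrative data (assumed for this sketch, not from the question)
dat1 <- array(1:24, c(2, 3, 4))
dat2 <- array(-(1:24), c(2, 3, 4))
dat3 <- array(rep(c(0.1, 0.3), 12), c(2, 3, 4))  # alternates below/above 0.2

# Take dat2 where dat3 > 0.2, otherwise keep dat1
res <- ifelse(dat3 > 0.2, dat2, dat1)

dim(res)     # same dimensions as the inputs
res[, , 1]   # first slice: row 1 keeps dat1 values, row 2 takes dat2 values
```

Note that `ifelse` returns a new array and leaves dat1 unchanged; assign the result back (`dat1 <- ifelse(...)`) if dat1 itself should be updated.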
Aaron left Stack Overflow
Another option, creating an index first and then using it to subset:

idx <- dat3 > 0.2
dat1[idx] <- dat2[idx]
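
To see that the index approach gives the same values as `ifelse`, here is a minimal sketch on made-up data (the values are assumptions chosen for illustration, not the question's random arrays):

```r
# Illustrative data (assumed for this sketch)
dat1 <- array(1:24, c(2, 3, 4))
dat2 <- array(-(1:24), c(2, 3, 4))
dat3 <- array(rep(c(0.1, 0.3), 12), c(2, 3, 4))

# Reference result, computed before dat1 is modified
by_ifelse <- ifelse(dat3 > 0.2, dat2, dat1)

idx <- dat3 > 0.2          # logical array with the same shape as dat3
dat1[idx] <- dat2[idx]     # overwrite only the flagged cells, in place

sum(idx)                   # number of cells that were replaced
all(dat1 == by_ifelse)     # TRUE if both approaches agree
```

Unlike `ifelse`, the indexed assignment modifies dat1 directly, so keep a copy first if the original values are still needed.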

Edit after comment - a small performance comparison:

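set.seed(2015)
N <- 1e6
# Note: array() keeps only the first prod(c(3,5,4)) = 60 values of its data,
# so each array below holds 60 elements regardless of N.
dat1 <- array(N, c(3,5,4))          # every cell is 1e6 (the single value is recycled)
dat2 <- array(rnorm(N), c(3,5,4))
dat3 <- array(rnorm(N), c(3,5,4))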

library(microbenchmark)
microbenchmark(
  ifelse = dat1 <- ifelse(dat3 > 0.2, dat2, dat1),
  index = {idx <- dat3 > 0.2
           dat1[idx] <- dat2[idx]},
  unit = "relative"
)

Unit: relative
   expr      min       lq   median       uq      max neval
 ifelse 5.131963 6.460236 5.545135 5.467555 33.86001   100
  index 1.000000 1.000000 1.000000 1.000000  1.00000   100

For the sample data, indexing is roughly 5.5 times faster than ifelse at the median.

If the replacement has to satisfy two conditions, combine them with & when building the index:

idx <- dat3 > 0.2 & dat3 < 0.5
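
A short sketch of the two-condition index on made-up values (assumed for illustration), including the boundary cases 0.2 and 0.5, which are excluded because both comparisons are strict:

```r
# Illustrative data (assumed for this sketch)
dat1 <- array(1:8, c(2, 2, 2))
dat2 <- array(-(1:8), c(2, 2, 2))
dat3 <- array(c(0.1, 0.3, 0.6, 0.4, 0.2, 0.5, 0.25, 0.9), c(2, 2, 2))

# Element-wise AND: use the vectorised `&`, not the scalar `&&`
idx <- dat3 > 0.2 & dat3 < 0.5
dat1[idx] <- dat2[idx]

which(idx)   # linear positions of the replaced cells
```

Use `>=` or `<=` instead if the boundary values should be included.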
talat
  • @Barry, I haven't tested this kind of data, but in many similar/related questions subsetting with index was faster than `ifelse`. – talat Mar 20 '15 at 16:14
  • @Barry, I added a small benchmark comparison. Also see [this question](https://stackoverflow.com/questions/16275149/does-ifelse-really-calculate-both-of-its-vectors-every-time-is-it-slow) explaining why `ifelse` may be slow. – talat Mar 20 '15 at 16:28
  • @Barry, I assume you don't want to replace NAs. Try this instead: `idx <- dat3 > 0.2 & !is.na(dat3)` to build the index – talat Mar 20 '15 at 16:47
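
The NA-safe index from the last comment can be sketched as follows. The data are made up for illustration. A comparison such as `NA > 0.2` yields NA, and combining it with `!is.na(dat3)` turns those entries into FALSE, so cells where dat3 is missing are left untouched:

```r
# Illustrative data with missing values in dat3 (assumed for this sketch)
dat1 <- array(1:8, c(2, 2, 2))
dat2 <- array(-(1:8), c(2, 2, 2))
dat3 <- array(c(0.1, 0.3, NA, 0.4, 0.2, NA, 0.25, 0.9), c(2, 2, 2))

# NA & FALSE is FALSE, so the index contains no NA
idx <- dat3 > 0.2 & !is.na(dat3)
anyNA(idx)                 # check that the index is NA-free

dat1[idx] <- dat2[idx]     # cells where dat3 is NA keep their dat1 value
which(idx)                 # linear positions of the replaced cells
```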