I am a bit lost about how to optimize `for` loops in R. I have a set stored as a 0/1 vector: element `i` belongs to the set iff `contains[[i]] == 1`. I want to check whether a given set of indices is included in this set. Currently I have the following code. Can it be written more efficiently?

```r
contains = c(1, 0, 0, 1, 0, 1, 1, 0)
indices = c(4, 5) # not ok
# indices = c(4, 6) # ok
ok <- TRUE
for (index in indices) {
    if (contains[[index]] == 0) {
        ok <- FALSE
        break
    }
}
if (ok) {
    print("ok")
} else {
    print("not ok")
}
```
fflorian
  • Does this answer your question? [Is there an R function for finding the index of an element in a vector?](https://stackoverflow.com/questions/5577727/is-there-an-r-function-for-finding-the-index-of-an-element-in-a-vector) – ekoam Nov 05 '20 at 17:07
  • Instead of a loop, `ok = all(indices %in% which(contains == 1))` or `ok = all(contains[indices] == 1)`? Probably the second would be a little faster. – Gregor Thomas Nov 05 '20 at 17:09
  • No, or I'm missing how. I would like something with complexity O(|indices|) – fflorian Nov 05 '20 at 17:13
  • @GregorThomas Indeed, this seems to work – fflorian Nov 05 '20 at 17:16

1 Answer

I would suggest either of these:

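```r
# Option 1: test membership against the positions that hold a 1
ok = all(indices %in% which(contains == 1))
# Option 2: index directly and compare
ok = all(contains[indices] == 1)
```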

Both are vectorized and will be faster than a `for` loop in almost all cases. (Exception: if the vectors involved are very long and a mismatch occurs early, your `break` stops searching at the first failure and will probably be faster.)
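To make the comparison concrete, here is a minimal sketch that wraps the direct-indexing option and an early-exit loop in functions. The names `subset_vectorised` and `subset_loop` are hypothetical, chosen only for illustration, and the loop assumes `contains` holds only 0s and 1s, as in the question.

```r
contains <- c(1, 0, 0, 1, 0, 1, 1, 0)

# Vectorised check: only the requested positions are read, so the work
# is proportional to length(indices).
subset_vectorised <- function(contains, indices) {
  all(contains[indices] == 1)
}

# Loop with early exit: returns as soon as one index is not in the set.
# For 0/1 data this matches the loop in the question.
subset_loop <- function(contains, indices) {
  for (index in indices) {
    if (contains[[index]] != 1) {
      return(FALSE)
    }
  }
  TRUE
}

subset_vectorised(contains, c(4, 5))  # FALSE
subset_vectorised(contains, c(4, 6))  # TRUE
subset_loop(contains, c(4, 5))        # FALSE
subset_loop(contains, c(4, 6))        # TRUE
```

Note that the `%in% which(contains == 1)` variant scans all of `contains` before testing membership, so direct indexing is the closer match if the cost should depend only on the number of indices.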

If you need really fast solutions on biggish data, please share some code to simulate data at scale so we can benchmark on a relevant use case.
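As a starting point, simulated data for such a benchmark might look like the sketch below. The sizes, the seed, and the helper names `vectorised` and `looped` are all assumptions made for illustration, not figures from the question. Every sampled index is in the set, which is the worst case for the loop because it can never exit early.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

n <- 1e6                                        # assumed length of `contains`
contains <- sample(c(0, 1), n, replace = TRUE)  # assumed: roughly half ones
indices <- sample(which(contains == 1), 1e5)    # every index is in the set

vectorised <- function() all(contains[indices] == 1)

looped <- function() {
  for (index in indices) {
    if (contains[[index]] == 0) return(FALSE)
  }
  TRUE
}

# Both approaches must agree before timing them.
stopifnot(vectorised() == looped())

# Repeat each call a few times so the timings are measurable.
system.time(for (i in 1:20) vectorised())
system.time(for (i in 1:20) looped())
```

Timings depend on the machine and on where the first mismatch falls, so it is worth rerunning with `indices` drawn to include an early miss as well.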

Gregor Thomas