I have a list of items v1, v2, etc. Important: its length is NOT known in advance!

Each item (v1, v2, etc.) has a set of possible levels (1:2, 1:3, etc.). I need to create a data frame with all possible combinations of levels across the items. I can do this quite efficiently with 'expand.grid':

mylist <- list(v1 = 1:3, v2 = 1:3, v3 = 1:3)
combos <- expand.grid(mylist, KEEP.OUT.ATTRS = FALSE)

But here comes the complication. Some levels of some items are prohibited from appearing together, e.g. v1 == 1 and v3 == 1 cannot be combined. Below is how I would define such prohibitions (two prohibitions here):

prohibitions = data.frame(item1 = c("v1", "v2"), Level1 = c(1,2),
                          item2 = c("v3", "v3"), Level2 = c(1,3),
                          stringsAsFactors = FALSE)
prohibitions

Of course, I could take my result of expand.grid ('combos') with all possible combinations of item levels, and then remove rows that contain prohibitions:

# seq_len() is safe when 'prohibitions' has zero rows (1:0 would iterate over 1 and 0)
for (row in seq_len(nrow(prohibitions))) {
  item1 <- prohibitions[row, 'item1']
  item1_level <- prohibitions[row, 'Level1']
  item2 <- prohibitions[row, 'item2']
  item2_level <- prohibitions[row, 'Level2']
  # Removing rows that contain prohibited combinations:
  combos <- combos[!(combos[[item1]] == item1_level &
                       combos[[item2]] == item2_level), ]
}
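
As a rough sketch of the same filtering with a single subset at the end, one logical vector per prohibition can be built and combined with OR. The names 'violations', 'bad' and 'allowed' are made up for this illustration, and this still materializes the full grid first, so it does not address the size concern:

```r
mylist <- list(v1 = 1:3, v2 = 1:3, v3 = 1:3)
combos <- expand.grid(mylist, KEEP.OUT.ATTRS = FALSE)
prohibitions <- data.frame(item1 = c("v1", "v2"), Level1 = c(1, 2),
                           item2 = c("v3", "v3"), Level2 = c(1, 3),
                           stringsAsFactors = FALSE)

# One logical vector per prohibition: TRUE where a row violates it.
violations <- Map(function(i1, l1, i2, l2) combos[[i1]] == l1 & combos[[i2]] == l2,
                  prohibitions$item1, prohibitions$Level1,
                  prohibitions$item2, prohibitions$Level2)

# A row is dropped if it violates any prohibition; the all-FALSE start value
# keeps this working when there are no prohibitions at all.
bad <- Reduce(`|`, violations, rep(FALSE, nrow(combos)))
allowed <- combos[!bad, , drop = FALSE]
nrow(allowed)
```

With the two prohibitions above, 3 + 3 of the 27 rows are removed (the two rules cannot overlap, since one needs v3 == 1 and the other v3 == 3).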

However, I am not sure this is an efficient way of doing it, mainly because when 'mylist' is long and some items have many levels, 'combos' becomes huge. So I thought it might be better to build 'combos' on the fly while taking the prohibitions into account. But then it seems I would have to write a very long loop through all the items. Problems with that:

  1. I don't know how to write a loop through a bunch of items (v1, v2, etc.) when I don't know in advance how many of them there are.
  2. Loops in R are slow.
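
One possible shape for the "on the fly" idea, as a sketch only: loop over names(mylist), so the number of items does not need to be known, add one item at a time with a cross join, and drop prohibited rows as soon as both items of a prohibition are present. The function name 'build_combos' is hypothetical, and the approach assumes every prohibition involves exactly two items, as in the 'prohibitions' data frame above.

```r
mylist <- list(v1 = 1:3, v2 = 1:3, v3 = 1:3)
prohibitions <- data.frame(item1 = c("v1", "v2"), Level1 = c(1, 2),
                           item2 = c("v3", "v3"), Level2 = c(1, 3),
                           stringsAsFactors = FALSE)

# Hypothetical helper: grows the grid one item at a time and prunes early.
build_combos <- function(items, prohibitions) {
  combos <- NULL
  for (item in names(items)) {
    new_col <- setNames(data.frame(items[[item]]), item)
    # merge() with by = NULL returns the Cartesian product of its inputs.
    combos <- if (is.null(combos)) new_col else merge(combos, new_col, by = NULL)
    # Prohibitions whose two items are both in 'combos' can be applied now.
    ready <- prohibitions$item1 %in% names(combos) &
      prohibitions$item2 %in% names(combos)
    for (r in which(ready)) {
      bad <- combos[[prohibitions$item1[r]]] == prohibitions$Level1[r] &
        combos[[prohibitions$item2[r]]] == prohibitions$Level2[r]
      combos <- combos[!bad, , drop = FALSE]
    }
    # Applied prohibitions never need to be checked again.
    prohibitions <- prohibitions[!ready, , drop = FALSE]
  }
  rownames(combos) <- NULL
  combos
}

res <- build_combos(mylist, prohibitions)
nrow(res)
```

The loop runs once per item, not once per row, so its overhead is small compared with the cross joins. How much memory this saves depends on the data: pruning only helps from the point where both items of a prohibition have been added, so placing heavily constrained items early in the list would likely matter.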

Or maybe there is a way in R to build iterators or stacks like in Python so that I could build my 'combos' as a stack and then evaluate each row one at a time?

Any advice? Or is the solution I proposed above the only reasonable one? Thank you very much!

user3245256
  • How many combinations and prohibitions are you actually working with? This example doesn't make it clear what the technical challenge is, or what rule is driving the prohibitions. R does not have iterators like Python. It would be nice if your [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) had more realistic data and perhaps timings, so we can see what your run time is and what you need it to be. – MrFlick Jan 12 '18 at 17:11
  • Fair point. I am looking at a realistic (smallish) example with the following specs for 'mylist': mylist <- list(v1 = 1:5, v2 = 1:5, v3 = 1:5, v4 = 1:6, v5 = 1:5, v6 = 1:7, v7 = 1:2, v8 = 1:2, v9 = 1:2, v10 = 1:2, v11 = 1:2, v12 = 1:5) and the following prohibitions as a rather small example: prohibitions = data.frame(item1 = c("v1", "v2", "v5", "v8", "v8"), Level1 = c(1, 2, 5, 2, 2), item2 = c("v3", "v3", "v7", "v12", "v12"), Level2 = c(1, 3, 2, 4, 5), stringsAsFactors = FALSE) – user3245256 Jan 12 '18 at 18:13
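
For the larger 'mylist' given in the comment above, the size of the full grid can be checked before calling expand.grid(), since the row count is just the product of the item lengths. This is a small sketch; 'n_rows' and 'approx_mb' are names made up here, and the memory figure assumes 4-byte integer columns and ignores any overhead.

```r
mylist <- list(v1 = 1:5, v2 = 1:5, v3 = 1:5, v4 = 1:6, v5 = 1:5, v6 = 1:7,
               v7 = 1:2, v8 = 1:2, v9 = 1:2, v10 = 1:2, v11 = 1:2, v12 = 1:5)

# Number of rows expand.grid(mylist) would produce.
n_rows <- prod(lengths(mylist))
n_rows  # 4200000

# Rough size estimate in MB, assuming 4 bytes per integer cell.
approx_mb <- n_rows * length(mylist) * 4 / 1024^2
approx_mb
```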

1 Answer

Have you tried the wildcard package? It's like expand.grid(), but a bit more flexible.

landau