Subsetting Elements in a "Hypothetical" List

Question

I found this R function (Algorithm to calculate power set (all possible subsets) of a set in R) that can return the "power set" for a set of letters. Below I assign it to f() then use it to return a power set in a list.

f <- function(set) { 
  n <- length(set)
  masks <- 2^(1:n-1)
  lapply( 1:2^n-1, function(u) set[ bitwAnd(u, masks) != 0 ] )
}

results = f(LETTERS[1:3])

[[1]]
character(0)

[[2]]
[1] "A"

[[3]]
[1] "B"

[[4]]
[1] "A" "B"

[[5]]
[1] "C"

[[6]]
[1] "A" "C"

[[7]]
[1] "B" "C"

[[8]]
[1] "A" "B" "C"

I could then draw some random sequence of letters from the power set object using an index value:

> random = sample.int(length(results), 1)
[1] 6

> results[random]
[[1]]
[1] "A" "C"

Suppose now I want to make a list of the power set for all 26 letters. Since this list would have 2^26 = 67108864 elements, it would be too large to store in memory:

# too big
big_results = f(LETTERS[1:26])

But suppose I were to instead generate some random number I know is an index of the big_results results:

big_random = sample.int(67108864, 1)

13626980

Is there some way of knowing which permutation of letters the "big_random" would correspond to on the "big_results" list - without actually fully running "big_results"?

For example:

# too big to run (hypothetical list)
big_results[big_random]

Could some enumeration or recursion formula be used to figure out some pattern and then somehow determine that "13626980" corresponds to the letters "P H R T L D U Z" for instance?

@ Donald Seinen: Thank you so much - although I don't fully understand the code you have provided, this looks really interesting! Just a question, suppose I have some arbitrary list, e.g. my_list = c("red", "blue", "green", "yellow", "orange", "purple", "pink") ..... could your code still work? — stats_noob, Jul 23 '22 at 04:42
You could rewrite the function as a `while` loop instead of using `apply`. It would then create every element of the power list in sequence until the xth (i.e. 13626980th) element is produce and the loop is exited.. This will likely be pretty slow, but you wouldn't need to actually save anything except for the 13626980th element in memory and likely wouldn't need to iterate through the entire set. — socialscientist, Jul 23 '22 at 04:46
@ rawr: thank you for your comment! this looks interesting - what exactly is the online encyclopedia of integer sequences? why is this useful? thank you so much! — stats_noob, Jul 23 '22 at 04:47
@ socialscientist: thank you for your comment! I feel this approach might take a very long time if I understood this correctly! lol! — stats_noob, Jul 23 '22 at 04:48
@antonoyaro8 that is essentially what the algorithm that was linked would do - note the looping. — socialscientist, Jul 23 '22 at 04:49
@antonoyaro8 Instead of `lapply` which calculates all, just compute for the single element you are interested in. To generalize it a bit, `g <- function(set, n) set[bitwAnd(n-1, 2^(1:length(set)-1)) != 0]`, then you can compare it to the output of `f(my_list)[[n]])` for any n — Donald Seinen, Jul 23 '22 at 05:03
Thank you everyone for your kind comments and answers - I really appreciate it! — stats_noob, Jul 23 '22 at 19:50
I posted a related question here: https://stackoverflow.com/questions/73087875/subsetting-elements-in-a-hypothetical-list ... does anyone have any ideas about this one? thanks! — stats_noob, Jul 25 '22 at 02:19

Subsetting Elements in a "Hypothetical" List

0 Answers0