1

I am trying to write a function which chooses integers randomly from arbitrarily nested lists while maintaining order and the structure of the lists (though empty lists can be ignored). The number of samples should be uniformly random from 0 to the number of integers in the nested list.

For example, running listrand([1,2,3,[3,4],[65,[3]]]) 3 times might give:

[1, 3, [3], [65, [3]]]
[1, 2, [[3]]]
[[3, 4], [65]]

The catch is I need it to be uniformly distributed, so I can't use something like

sample = [[random.sample(mylist)] for mylist in listoflists]

because that would be binomial.

At a minimum I need this to work for single-level nesting. I thought about sampling from a flattened list but then I'm not sure how to use those to construct the desired output.

  • How important is it that the first output be written as `[1, 3, [3], [65, [3]]]` instead of `[1, 3, [3], [65], [[3]]]`? Also, if both the 3 and the 4 were to be sampled, could we write `[1, 3, [3], [4], [65]]`? – chthonicdaemon Jan 24 '18 at 03:38
  • It's important to keep the sampled items within their original lists. So if `65` is in `list3` and `[3]` is in `list3` then the output should have `[65, [3]]` in the location of `list3` – base12masterrace Jan 24 '18 at 03:42
  • OK. Another question: your outputs have different lengths. Does the function choose each value with a certain probability or does it choose a certain number of values with equal probability? – chthonicdaemon Jan 24 '18 at 03:44
  • Sorry, forgot to mention that. The number of samples should be uniformly random from 0 to the number of integers in the nested list. – base12masterrace Jan 24 '18 at 03:47
  • Are you missing a set of outer brackets? Is the list really `[[1, 3, [3], [65, [3]]], [1, 2, [[3]]], [[3, 4], [65]]]`? Are you wanting a sample from the flattened list or one element of the top-most list of lists and integers? – dawg Jan 24 '18 at 03:57
  • I just saw your update. Are you saying that `k` in `random.sample(x, k)` should be `random.randint(0, len(x))`? – James Jan 24 '18 at 04:03
  • @dawg those are three separate outputs from each time the function is run. The function should sample from the flattened list, ie. integers, but have output that maintains the structure. – base12masterrace Jan 24 '18 at 04:04
  • @James yes, exactly, where len(x) is the length of the flattened list. – base12masterrace Jan 24 '18 at 04:06

1 Answers1

3

This solution satisfies your requirements by construction. In other words, the number of elements chosen is uniformly random.

import random
from collections import Iterable

def count(l, r=0):
    for i in l:
        if isinstance(i, Iterable):
            r += count(i)
        else:
            r += 1
    return r

def listrand(target):
    N = count(target)
    nchosen = random.randint(0, N-1)
    chosen = set(random.sample(range(N), nchosen))

    def build(l, c=0):
        output = []
        for i in l:
            if isinstance(i, Iterable):
                c, sublist = build(i, c)
                if sublist:
                    output.append(sublist)
            else:
                if c in chosen:
                    output.append(i)
                c += 1
        return c, output

    return build(target)[1]        

Sample output:

target = [1,2,3,[3,4], [65,[3]]]

for i in range(5):
    print(listrand(target))

[1, 2, 3, [3, 4], [[3]]]
[2]
[]
[2, 3, [3, 4]]
[[4]]
chthonicdaemon
  • 19,180
  • 2
  • 52
  • 66