THE STORY
I have a big sequence of objects:

    OBJS = [o_1, o_2, ..., o_n]

Each object can be recalculated, which is expensive. During recalculation it may add items to and remove items from the sequence:
    class Obj:
        def recalculate(self):
            # some expensive calcs here
            ...
            # may add objects
            if create_new_obj:
                OBJS.append(Obj())
            # may remove objects
            if delete_obj:
                del OBJS[idx]
And I have a loop recalculating them, which I want to iterate as fast as possible:

    while True:
        for obj in OBJS:
            obj.recalculate()
What I can do is recalculate only a fraction of them on each iteration. I can add a probability attribute to the Obj class, or pair probabilities with the objects in the sequence like this:

    OBJS = [
        [o_1, 0.0001],  # recalculate once per 10 000 iterations on average
        [o_2, 1.0],     # recalculate every iteration
        ...,
        [o_n, 0.5]      # recalculate once per 2 iterations on average
    ]
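As a sanity check on these probabilities, the expected subset size per iteration is just the sum of the probabilities (by linearity of expectation). Using the three example values shown above:

```python
# Expected number of objects recalculated per iteration is the sum of
# the per-object probabilities (linearity of expectation).
probs = [0.0001, 1.0, 0.5]  # the three example probabilities above
expected_per_iteration = sum(probs)
print(expected_per_iteration)  # → 1.5001
```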
Then I create a generator which yields the subset of objects to be recalculated this iteration:

    import random

    def pick_subset_of_randoms(sequence):
        for obj, probability in sequence:
            if random.random() <= probability:
                yield obj
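If the per-element Python-level check is the bottleneck, one common idea (a sketch of my own, not part of the question) is to draw all the random numbers in a single numpy call and compare them against a precomputed probability array; the names below are hypothetical:

```python
import numpy as np

# Vectorized sketch: one bulk random draw plus a boolean mask, instead
# of one random.random() call per object. Assumes probs is a numpy
# array aligned index-by-index with objs.
def pick_subset_of_randoms_np(objs, probs):
    mask = np.random.random(len(objs)) <= probs
    return [obj for obj, picked in zip(objs, mask) if picked]
```

This still walks the whole sequence once in Python to collect the picked objects, but it moves the random-number generation and comparison into C.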
And update the loop like this:

    while True:
        for obj in pick_subset_of_randoms(OBJS):
            obj.recalculate()
THE PROBLEM
Is there any way to optimize the pick_subset_of_randoms generator?
The ideal solution would avoid the for loop iterating through the whole sequence, because the subset may be tens or hundreds of thousands of times smaller than the sequence itself.
Third-party packages (say, numpy) are allowed. Any suggestions are appreciated!
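One possible direction, sketched here only for the special case where every object shares the same probability p: the gaps between picked indices follow a geometric distribution, so you can jump straight to the next pick instead of testing every element. Runtime is then proportional to the subset size, not the sequence length. The function name is hypothetical and the code assumes 0 < p < 1.

```python
import math
import random

# Sketch: skip-sampling for a uniform per-object probability p.
# Instead of flipping a coin per element, draw the size of the gap
# to the next success from a geometric distribution and jump there.
def pick_subset_geometric(sequence, p):
    log_q = math.log(1.0 - p)  # log of the failure probability
    i = -1
    n = len(sequence)
    while True:
        # floor(log(U) / log(1 - p)) is the number of failures
        # before the next success, where U is uniform on (0, 1]
        gap = int(math.log(1.0 - random.random()) / log_q)
        i += gap + 1
        if i >= n:
            return
        yield sequence[i]
```

For per-object probabilities this trick does not apply directly; one workaround is to bucket objects by probability and skip-sample each bucket separately.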