THE STORY
I have a big sequence of objects:

    OBJS = [o_1, o_2, ..., o_n]

Each object can be recalculated, which is expensive. During recalculation it may add items to and remove items from the sequence:
    class Obj:
        def recalculate(self):
            # some expensive calcs here
            ...
            # may add objects
            if create_new_obj:
                OBJS.append(Obj())
            # may remove objects
            if delete_obj:
                del OBJS[idx]
And I have a loop recalculating them, which I want to iterate as fast as possible:

    while True:
        for obj in OBJS:
            obj.recalculate()
What I can do is recalculate only a fraction of them on each iteration. I can add a probability attribute to the Obj class, or pair probabilities with the objects in the sequence like this:

    OBJS = [
        [o_1, 0.0001],  # recalculate once per 10 000 iterations on average
        [o_2, 1.0],     # recalculate every iteration
        ...,
        [o_n, 0.5]      # recalculate once per 2 iterations on average
    ]
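As a sanity check on these probabilities, the expected subset size per iteration is just the sum of the probabilities (by linearity of expectation). Using the three example values shown above:

```python
# Expected number of objects recalculated per iteration is the sum of
# the per-object probabilities (linearity of expectation).
probs = [0.0001, 1.0, 0.5]  # the three example probabilities above
expected_per_iteration = sum(probs)
print(expected_per_iteration)  # → 1.5001
```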
Then I create a generator which yields the subset of objects to be recalculated this iteration:

    import random

    def pick_subset_of_randoms(sequence):
        for obj, probability in sequence:
            if random.random() <= probability:
                yield obj
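If the per-element Python-level check is the bottleneck, one common idea (a sketch of my own, not part of the question) is to draw all the random numbers in a single numpy call and compare them against a precomputed probability array; the names below are hypothetical:

```python
import numpy as np

# Vectorized sketch: one bulk random draw plus a boolean mask, instead
# of one random.random() call per object. Assumes probs is a numpy
# array aligned index-by-index with objs.
def pick_subset_of_randoms_np(objs, probs):
    mask = np.random.random(len(objs)) <= probs
    return [obj for obj, picked in zip(objs, mask) if picked]
```

This still walks the whole sequence once in Python to collect the picked objects, but it moves the random-number generation and comparison into C.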
And update the loop like this:

    while True:
        for obj in pick_subset_of_randoms(OBJS):
            obj.recalculate()
THE PROBLEM
Is there any way to optimize the pick_subset_of_randoms generator?
The ideal solution would avoid the for loop iterating through the whole sequence, because the subset may be tens or hundreds of thousands of times smaller than the sequence itself.
Third-party packages (say, numpy) are allowed. Any suggestions are appreciated!
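One possible direction, sketched here only for the special case where every object shares the same probability p: the gaps between picked indices follow a geometric distribution, so you can jump straight to the next pick instead of testing every element. Runtime is then proportional to the subset size, not the sequence length. The function name is hypothetical and the code assumes 0 < p < 1.

```python
import math
import random

# Sketch: skip-sampling for a uniform per-object probability p.
# Instead of flipping a coin per element, draw the size of the gap
# to the next success from a geometric distribution and jump there.
def pick_subset_geometric(sequence, p):
    log_q = math.log(1.0 - p)  # log of the failure probability
    i = -1
    n = len(sequence)
    while True:
        # floor(log(U) / log(1 - p)) is the number of failures
        # before the next success, where U is uniform on (0, 1]
        gap = int(math.log(1.0 - random.random()) / log_q)
        i += gap + 1
        if i >= n:
            return
        yield sequence[i]
```

For per-object probabilities this trick does not apply directly; one workaround is to bucket objects by probability and skip-sample each bucket separately.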