I need to take a massive list of lists and remove lists that are "unfit".
When using Pool.apply_async, Task Manager shows only around 10% CPU and 97% memory in use, and the whole process takes forever.
I am not very knowledgeable on this, but if I am using all my cores, I feel as though it should be using more than 10% cpu.
So my questions are as follows:
- Is Pool.apply_async the best way to accomplish my goal? I feel like going back to the main process each time I want to remove an item via the callback is adding too much time/overhead.
- What is causing the extreme use of memory?
Here is an example of my code, using a smaller list to demonstrate:
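from multiprocessing import Pool

w_list = [[1, 0, 1], [1, 1, 0], [1, 1, 1]]
budget = 299
cost = [100, 100, 100]

def cost_interior(w):
    # Sum the cost of every item selected (flag == 1) in this list.
    total_cost = 0
    for item in range(0, len(w)):
        if w[item] == 1:
            total_cost = total_cost + cost[item]
    # Hand an unfit list back to the callback; a fit list returns None.
    # (Calling w_list.remove(w) here would only change the worker's copy.)
    if total_cost > budget or total_cost < (0.5 * budget):
        return w

def remove_unfit(unfit):
    # Runs in the main process each time a worker finishes a task.
    if unfit is not None:
        w_list.remove(unfit)

if __name__ == "__main__":
    p = Pool(2)
    for w in w_list:
        p.apply_async(cost_interior, args=(w,), callback=remove_unfit)
    p.close()
    p.join()
    print(w_list)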
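
For comparison, here is a minimal sketch of one alternative to the callback approach, not a benchmarked solution. Each worker only reports whether a list is fit, and the main process builds a new list once at the end instead of calling w_list.remove for every unfit item. The names is_fit and filter_fit and the chunksize value of 1000 are hypothetical and chosen only for illustration; whether this reduces the CPU and memory symptoms described above would have to be measured on the real data.

```python
from multiprocessing import Pool

budget = 299
cost = [100, 100, 100]


def is_fit(w):
    """Return True when the cost of the selected items lies in [0.5 * budget, budget]."""
    total_cost = sum(c for flag, c in zip(w, cost) if flag == 1)
    return 0.5 * budget <= total_cost <= budget


def filter_fit(candidates, processes=2, chunksize=1000):
    """Keep only the fit lists (hypothetical helper; chunksize is an assumed value)."""
    with Pool(processes) as p:
        # Pool.map sends the lists to the workers in batches of `chunksize`
        # and returns one boolean per list, in the original order.
        keep = p.map(is_fit, candidates, chunksize=chunksize)
    # Build the filtered list in a single pass over the results.
    return [w for w, ok in zip(candidates, keep) if ok]


if __name__ == "__main__":
    w_list = [[1, 0, 1], [1, 1, 0], [1, 1, 1]]
    print(filter_fit(w_list))
```

With the three example lists, the first two cost 200 and pass the check, while the third costs 300 and exceeds the budget of 299.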