I have an application that needs to initialize a large number of objects in Python (3.5.2), and I encounter occasional slow-downs.
The slow-down seems to occur on one specific initialization: most of the calls to __init__
take well under a microsecond, but one of them occasionally takes several dozen seconds.
I've been able to reproduce this with the following snippet, which creates 500k instances of a trivial class.
import cProfile

class A:
    def __init__(self):
        pass

cProfile.run('[A() for _ in range(500000)]')
I'm running this code in a notebook. Most of the time (about 9 runs out of 10), it produces the following output (normal execution):
500004 function calls in 0.675 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
500000 0.031 0.000 0.031 0.000 <ipython-input-5-634b77609653>:2(__init__)
1 0.627 0.627 0.657 0.657 <string>:1(<listcomp>)
1 0.018 0.018 0.675 0.675 <string>:1(<module>)
1 0.000 0.000 0.675 0.675 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
The rest of the time, it produces the following output (slow execution):
500004 function calls in 40.154 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
500000 0.031 0.000 0.031 0.000 <ipython-input-74-634b77609653>:2(__init__)
1 40.110 40.110 40.140 40.140 <string>:1(<listcomp>)
1 0.014 0.014 40.154 40.154 <string>:1(<module>)
1 0.000 0.000 40.154 40.154 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Wrapping the loop with tqdm shows that it gets stuck on a single iteration. It's important to note that I was able to reproduce this in a notebook that already had a lot of memory allocated.
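For reference, this is roughly how I observed it (a sketch, assuming tqdm is installed and class A is defined as above):

from tqdm import tqdm

# Same allocation loop, wrapped in a progress bar: during a slow run
# the bar stalls on a single iteration for tens of seconds.
objs = [A() for _ in tqdm(range(500000))]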
I suspect it comes from the garbage collector's internal list of references to tracked objects, which might occasionally need to be reallocated and copied as it grows.
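One way I imagine testing this suspicion (a sketch, not verified): have the gc module print collection statistics to see whether the pause coincides with a collection, or suspend the cyclic collector around the allocation entirely.

import cProfile
import gc

# Print a summary to stderr every time a collection runs, so a pause
# can be matched against GC activity.
gc.set_debug(gc.DEBUG_STATS)

# Alternatively, rule the collector out: reference counting still frees
# the objects, only cycle detection is suspended while disabled.
gc.disable()
cProfile.run('[A() for _ in range(500000)]')
gc.enable()
gc.set_debug(0)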
What exactly is happening here, and is there any way to avoid it?