Problem
Let's assume that I want to find n**2 for all numbers smaller than 20000000.
General setup for all three variants that I test:
import time, psutil, gc
gc.collect()
mem_before = psutil.virtual_memory()[3]
time1 = time.time()
# (comprehension, generator, function)-code comes here
time2 = time.time()
mem_after = psutil.virtual_memory()[3]
print "Used Mem = ", (mem_after - mem_before)/(1024**2) # convert Byte to Megabyte
print "Calculation time = ", time2 - time1
Three options to calculate these numbers:
1. Creating a list via comprehension:
x = [i**2 for i in range(20000000)]
It is really slow and uses a lot of memory:
Used Mem = 1270 # Megabytes
Calculation time = 33.9309999943 # Seconds
2. Creating a generator using '()':
x = (i**2 for i in range(20000000))
It is much faster than option 1, but still uses a lot of memory:
Used Mem = 611
Calculation time = 0.278000116348
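One possible explanation for the memory figure, sketched under the assumption that this runs on Python 2 (where range() returns a full list): a generator expression evaluates its outermost iterable as soon as the generator is created, before any value is requested. The helper eager_source below is a hypothetical stand-in for such a list-building range(), not part of the original test.

```python
calls = []

def eager_source(n):
    # Hypothetical stand-in for Python 2's range(): builds the whole list up front
    # and records that it was called.
    calls.append(n)
    return list(range(n))

gen = (i**2 for i in eager_source(5))

# The source list has already been built here, although no square
# has been requested from the generator yet.
built_before_iteration = list(calls)

squares = list(gen)
print(built_before_iteration)  # [5]
print(squares)                 # [0, 1, 4, 9, 16]
```

If this is what happens, the 611 MB would belong to the list of 20000000 integers, not to the generator itself.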
3. Defining a generator function (most efficient):
def f(n):
    i = 0
    while i < n:
        yield i**2
        i += 1
x = f(20000000)
Its consumption:
Used Mem = 0
Calculation time = 0.0
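These zero readings are probably because calling f only creates the generator object; the loop body does not run until values are requested. A minimal sketch, assuming the same f as above and pulling just a few values with next():

```python
def f(n):
    i = 0
    while i < n:
        yield i**2
        i += 1

x = f(20000000)  # returns immediately; no square has been computed yet

# Values are produced one at a time, only when asked for.
first_three = [next(x) for _ in range(3)]
print(first_three)  # [0, 1, 4]
```

So the measurement above covers the creation of the generator, not the cost of actually computing all 20000000 squares.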
The questions are:
- What's the difference between the first and second solutions? Using () creates a generator, so why does it need a lot of memory?
- Is there any built-in function equivalent to my third option?
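For the second question, one candidate worth checking is a generator expression over a lazy range. The sketch below assumes Python 2's xrange (or Python 3's range, which is already lazy); the name lazy_range is only an illustrative alias, not a built-in.

```python
from itertools import islice

# Assumption: xrange is the lazy built-in on Python 2; on Python 3, range is lazy.
try:
    lazy_range = xrange  # Python 2
except NameError:
    lazy_range = range   # Python 3

# No list of 20000000 integers is built here; values are produced on demand.
x = (i**2 for i in lazy_range(20000000))

first_five = list(islice(x, 5))
print(first_five)  # [0, 1, 4, 9, 16]
```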