I think I must be missing something; this seems like such an obviously good idea, but I can't see a way to do it.
Say you have a pure function in Python:
from math import sin, cos
def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
    return (x, y)
is there some built-in functionality or library that provides a wrapper of some sort that can release the GIL during the function's execution?
In my mind I am thinking of something along the lines of
from math import sin, cos
from somelib import pure
@pure
def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
    return (x, y)
Why do I think this might be useful?
Because multithreading, which is currently only attractive for I/O-bound programs, would also become attractive for CPU-bound pure functions like this one, once they run long enough to be worth parallelizing. Doing something like
from math import sin, cos
from somelib import pure
from asyncio import run, gather, create_task
@pure  # releases GIL for f
async def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
    return (x, y)

async def main():
    step_size = 0.1
    result = await gather(*[create_task(f(t * step_size))
                            for t in range(0, round(10 / step_size))])
    return result

if __name__ == "__main__":
    results = run(main())
    print(results)
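For comparison, the closest I can get today is an ordinary thread pool; below is a minimal sketch using only the standard library, with the same f as above. It runs, but because f is pure Python the GIL serializes the calls, which is exactly the limitation I'd like the hypothetical @pure to lift.

from math import sin, cos
from concurrent.futures import ThreadPoolExecutor

def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
    return (x, y)

if __name__ == "__main__":
    step_size = 0.1
    ts = [t * step_size for t in range(round(10 / step_size))]
    with ThreadPoolExecutor() as executor:
        # Threads share memory, so each (x, y) tuple is handed back by
        # reference -- but the GIL keeps the calls from running in parallel.
        results = list(executor.map(f, ts))
    print(results)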
Of course, multiprocessing offers Pool.map, which can do something very similar. However, if the function returns a non-primitive / complex type, the worker has to serialize it and the main process has to deserialize it into a brand-new object, so a copy is unavoidable. With threads, the child thread passes a pointer and the main thread simply takes ownership of the object. Much faster (and cleaner?).
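For reference, here is a minimal sketch of the Pool.map approach I'm comparing against (standard library only, same f as above); each returned tuple is pickled in the worker and unpickled, i.e. copied, in the main process:

from math import sin, cos
from multiprocessing import Pool

def f(t):
    x = 16 * sin(t) ** 3
    y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
    return (x, y)

if __name__ == "__main__":
    step_size = 0.1
    ts = [t * step_size for t in range(round(10 / step_size))]
    with Pool() as pool:
        # Each (x, y) tuple is serialized in the worker process and
        # deserialized (copied) in the main process.
        results = pool.map(f, ts)
    print(results)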
To tie this to a practical problem I encountered a few weeks ago: I was doing a reinforcement learning project, which involved building an AI for a chess-like game. For this I simulated the AI playing against itself for > 100,000 games, each game returning the resulting sequence of board states (a numpy array). Generating these games runs in a loop, and I use the data to train a stronger version of the AI each iteration. Here, re-creating ("malloc"-ing) the sequence of states for each game in the main process was the bottleneck. I experimented with re-using existing objects, which is a bad idea for many reasons, but that didn't yield much improvement.
Edit: This question differs from "How to run functions in parallel?", because I am not just looking for any way to run code in parallel (I know this can be achieved in various ways, e.g. via multiprocessing). I am looking for a way to let the interpreter know that nothing bad will happen when this function is executed in a parallel thread.