You were telling Python to execute the function `do_work()` first, and to then pass whatever that function returned to `executor.submit()`:

```python
executor.submit(do_work(count))
```
It might be easier to see this if you used a variable to hold the result of `do_work()`. The following is functionally equivalent to the above:

```python
do_work_result = do_work(count)
executor.submit(do_work_result)
```
In Python, functions are first-class objects; using just the name `do_work` references the function object. Only adding `(...)` to an expression that produces a function object (or another callable) actually executes something.
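To illustrate functions being first-class objects (a minimal sketch; this `do_work` body is made up for demonstration):

```python
def do_work(count):
    return count * 2

f = do_work        # no parentheses: f now refers to the same function object
print(f)           # <function do_work at 0x...>
print(f(21))       # 42 -- only the (...) syntax performs an actual call
```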
In the form

```python
executor.submit(do_work, count)
```

you do not call the function yourself. You pass in the function object as the first argument, and `count` as the second argument. The `executor.submit()` method accepts a callable object together with its arguments, to then later on run that callable with the arguments provided.
This allows the `ThreadPoolExecutor` to take that function reference and the single argument, and only call the function later on, in a worker thread.
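Putting it together, the corrected pattern looks like this (a runnable sketch; the body of `do_work()` is assumed, since the original isn't shown):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def do_work(count):
    # stand-in for the real work; assumed for this example
    time.sleep(1)
    print(f"did work for {count}")

with ThreadPoolExecutor(max_workers=4) as executor:
    for count in range(8):
        # pass the callable and its argument; do not call it here
        executor.submit(do_work, count)
```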
Because you were calling the function first, you had to wait for each call to complete before submitting the next, so everything ran sequentially. And because the functions return `None`, you were passing those `None` references to `executor.submit()`, and would have seen a `TypeError` exception later on telling you that `'NoneType' object is not callable`. That happens because the thread pool executor tried to call `None()`, which fails because `None` is indeed not callable.
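You can reproduce that error directly (a small demonstration, not code from the question):

```python
from concurrent.futures import ThreadPoolExecutor

def do_work(count):
    print(f"working on {count}")   # returns None implicitly

with ThreadPoolExecutor() as executor:
    # do_work(1) runs right here; the None it returns is what gets submitted
    future = executor.submit(do_work(1))
    print(future.exception())      # 'NoneType' object is not callable
```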
Under the hood, the library essentially does this:
```python
def submit(self, fn, *args, **kwargs):
    # record the function to be called as a work item, with other information
    w = _WorkItem(..., fn, args, kwargs)
    self._work_queue.put(w)
```
so a work item referencing the function and arguments is added to a queue. Worker threads are created which take items from that queue; when an item is taken from the queue (in another thread, or a child process), the `_WorkItem.run()` method is called, which runs your function:

```python
result = self.fn(*self.args, **self.kwargs)
```
Only then is the `(...)` call syntax applied. Because there are multiple threads, the code is executed concurrently.
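Here is a stripped-down sketch of that queue-and-worker mechanism (a toy illustration, not the actual concurrent.futures source):

```python
import queue
import threading

work_queue = queue.Queue()

def worker():
    while True:
        fn, args, kwargs = work_queue.get()
        if fn is None:              # sentinel value to shut the worker down
            break
        fn(*args, **kwargs)         # only here is the (...) call made

threading.Thread(target=worker).start()
work_queue.put((print, ("hello from the worker",), {}))
work_queue.put((None, (), {}))      # stop the worker
```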
You do want to read up on how pure Python code can't run in parallel, only concurrently: Does Python support multithreading? Can it speed up execution time?
Your `do_work()` functions only run 'faster' because `time.sleep()` doesn't have to do any actual work; it just tells the kernel not to give the sleeping thread any execution time for the requested duration. You end up with a bunch of threads that are all asleep. If your workers had to execute Python instructions instead, then the total time spent running these functions concurrently or sequentially would not differ all that much.
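You can see that difference with a quick timing comparison (an illustrative sketch; exact numbers will vary per machine):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def sleepy(_):
    time.sleep(0.5)         # sleeping releases the GIL, so threads overlap

def busy(_):
    total = 0
    for n in range(2_000_000):
        total += n          # pure Python bytecode; the GIL serialises this

for fn in (sleepy, busy):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as executor:
        for i in range(4):
            executor.submit(fn, i)
    # sleepy finishes in roughly 0.5s; busy takes about 4x a single run
    print(fn.__name__, time.perf_counter() - start)
```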