
I'd like to create a multi-threaded version of a function. I discovered that t.start() returns None, so I have to use a queue. I searched the documentation, but I don't understand how to use it in my example.

This is the function:

def derivative(lst, var):  # Example of lst =  [1 + [3 * x]]
    if len(lst) == 1:       
        return derive_solver(lst[0], var)

    if lst[1] == '+':
        return [derivative(lst[0], var), '+', derivative(lst[2], var)]

    if lst[1]  == '*':
        return [[derivative(lst[0], var), '*', lst[2]], '+', [lst[0], '*', derivative(lst[2], var)]]

And this is my attempt to multi-thread that function:

def derivative(lst, var):  # Example of lst =  [1 + [3 * x]]
    if len(lst) == 1:       
        return derive_solver(lst[0], var)

    if lst[1] == '+':
        t1 = threading.Thread(target = derivative,  args=(lst[0], var))
        t2 = threading.Thread(target = derivative,  args=(lst[2], var))
        return [t1.start(), '+', t2.start()]

    if lst[1]  == '*':
        t1 = threading.Thread(target = derivative,  args=(lst[0], var))
        t2 = threading.Thread(target = derivative,  args=(lst[2], var))
        return [[t1.start(), '*', lst[2]], '+', [lst[0], '*', t2.start()]] 

The problem is that t1.start() doesn't return values...

Do you have any idea how to solve this using a queue?

Thank you!

nettux
Matteo

1 Answer


The problem is that t1.start() doesn't return values...

Of course not. t1 hasn't finished at this point. If start waited for the background thread to finish, there would be absolutely no reason to use threads in the first place.

You need to set things up so the background threads post their work somewhere and signal you that they're done, then wait until both threads have signaled you. A queue is one way to do that. So is a shared variable plus a Condition. Or, in this case, just a shared variable plus joining the thread. But I'll show one way to do it with a queue, since that's what you asked for:

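import queue       # at the top of the module
import threading

# The rest of this fragment goes inside derivative().
def enthread(target, args):
    # Run target(*args) in a background thread and return the queue
    # that will receive its result.
    q = queue.Queue()
    def wrapper():
        q.put(target(*args))
    t = threading.Thread(target=wrapper)
    t.start()
    return q

# These three lines replace the body of the '+' branch.
q1 = enthread(target=derivative, args=(lst[0], var))
q2 = enthread(target=derivative, args=(lst[2], var))
return [q1.get(), '+', q2.get()]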

What I did there is create a queue, wrap the real target function in a wrapper that captures that queue, run the wrapper as the background thread's target, and have the background thread put its result on the queue. Then, the main thread can just wait on the queue.
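To make the fragment above easier to try out, here is a minimal sketch of how it might look once placed in a complete derivative function. The derive_solver below is a hypothetical stand-in (the real one is not shown in the question), and the nested-list input format, with every leaf wrapped in a one-element list, is an assumption made for illustration.

```python
import queue
import threading

def derive_solver(atom, var):
    # Hypothetical stand-in for the question's derive_solver:
    # derivative of a single atom with respect to var.
    return 1 if atom == var else 0

def enthread(target, args):
    # Run target(*args) in a background thread; the result is put on the returned queue.
    q = queue.Queue()
    def wrapper():
        q.put(target(*args))
    t = threading.Thread(target=wrapper)
    t.start()
    return q

def derivative(lst, var):
    if len(lst) == 1:
        return derive_solver(lst[0], var)
    # Start both sub-derivatives before waiting on either of them.
    q1 = enthread(target=derivative, args=(lst[0], var))
    q2 = enthread(target=derivative, args=(lst[2], var))
    d_left, d_right = q1.get(), q2.get()
    if lst[1] == '+':
        return [d_left, '+', d_right]
    if lst[1] == '*':
        return [[d_left, '*', lst[2]], '+', [lst[0], '*', d_right]]
    raise ValueError('unsupported operator: %r' % (lst[1],))

# Assumed encoding of 1 + 3 * x, with each leaf wrapped in a list.
expr = [[1], '+', [[3], '*', ['x']]]
result = derivative(expr, 'x')
```

Note that if a background call raises, nothing is ever put on its queue and the matching get() blocks forever, so this sketch only behaves well on well-formed input.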

Note that this isn't joining each thread, which can be a problem. But hopefully you can see how to expand on the code to make it more robust.
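One possible way to make this more robust, sketched under my own assumptions rather than as the definitive fix: keep the Thread object, join it, and store either the result or the exception in a shared holder. This is the "shared variable plus joining the thread" option mentioned earlier. The names run_in_thread and collect are hypothetical helpers, not part of the threading module.

```python
import threading

def run_in_thread(target, *args):
    # Start target(*args) in a thread; return the thread and a dict that
    # will hold either the result ('value') or the exception ('error').
    holder = {}
    def wrapper():
        try:
            holder['value'] = target(*args)
        except Exception as exc:
            holder['error'] = exc
    t = threading.Thread(target=wrapper)
    t.start()
    return t, holder

def collect(thread, holder):
    # join() guarantees the thread has finished, so the holder is filled in.
    thread.join()
    if 'error' in holder:
        raise holder['error']   # re-raise the worker's exception in the caller
    return holder['value']

t1, h1 = run_in_thread(sum, [1, 2, 3])
t2, h2 = run_in_thread(max, 4, 9)
total, biggest = collect(t1, h1), collect(t2, h2)
```

Because join() only returns after the thread has ended, no queue is needed here, and a failure in the background call surfaces in the caller instead of leaving it blocked.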

Also note that we're explicitly waiting for thread 1 to finish before checking on thread 2. In a situation where you can't do anything until you have all the results anyway, that's fine. But in many applications, you'll want a single queue, so you can pick up the results as they come in (tagging the values in some way if you need to be able to reconstruct the original order).
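The single-queue variant could look roughly like this. It is a sketch only: run_all and square are hypothetical names chosen for illustration, and each result is tagged with the index of its input so the original order can be rebuilt.

```python
import queue
import threading

def square(n):
    # Hypothetical placeholder for the real work.
    return n * n

def run_all(target, inputs):
    results = queue.Queue()   # one queue shared by every worker

    def worker(index, arg):
        # Tag the value with its position so order can be reconstructed.
        results.put((index, target(arg)))

    threads = [threading.Thread(target=worker, args=(i, arg))
               for i, arg in enumerate(inputs)]
    for t in threads:
        t.start()

    ordered = [None] * len(threads)
    for _ in threads:
        # get() returns whichever result arrives first, not necessarily thread 0's.
        index, value = results.get()
        ordered[index] = value

    for t in threads:
        t.join()
    return ordered

squares = run_all(square, [3, 1, 2])
```

As written, this assumes target never raises. A worker that fails would put nothing on the queue and the loop would wait forever, so real code would also want to put the exception on the queue.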


A much better solution is to use a higher-level abstraction, like a thread pool or a future (or an executor, which combines both abstractions into one). But it's worth understanding how these pieces work first, then learning how to do things the easy way. So, once you understand why this works, go read the docs on concurrent.futures.
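For comparison, here is roughly what the same idea might look like with concurrent.futures. This is a sketch under assumptions: derive_solver is again a hypothetical stand-in, leaves are assumed to be one-element lists, and only the top-level split is handed to the pool while the sub-expressions are handled by a plain sequential derivative.

```python
from concurrent.futures import ThreadPoolExecutor

def derive_solver(atom, var):
    # Hypothetical stand-in for the question's derive_solver.
    return 1 if atom == var else 0

def combine(lst, d_left, d_right):
    # Apply the sum rule or the product rule to the two sub-derivatives.
    if lst[1] == '+':
        return [d_left, '+', d_right]
    if lst[1] == '*':
        return [[d_left, '*', lst[2]], '+', [lst[0], '*', d_right]]
    raise ValueError('unsupported operator: %r' % (lst[1],))

def derivative(lst, var):
    # Plain sequential version, used for the sub-expressions.
    if len(lst) == 1:
        return derive_solver(lst[0], var)
    return combine(lst, derivative(lst[0], var), derivative(lst[2], var))

def derivative_threaded(lst, var):
    if len(lst) == 1:
        return derive_solver(lst[0], var)
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(derivative, lst[0], var)
        right = pool.submit(derivative, lst[2], var)
        # result() waits for the call to finish and re-raises any exception from it.
        return combine(lst, left.result(), right.result())

result = derivative_threaded([[1], '+', [[3], '*', ['x']]], 'x')
```

The split is deliberately not recursive. Submitting new tasks to the same bounded pool from inside its workers and then blocking on result() can deadlock once every worker is waiting on a task that has no free thread to run on.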


Finally, assuming you're using CPython or another GIL-based implementation (which you probably are), and that the derive_solver function isn't a C extension function explicitly designed to do most of its work without the GIL, this isn't going to be a good idea in the first place. Threads are great when you need concurrency without parallelism (because your code is simpler that way, or because it's I/O bound), but when you're actually trying to benefit from multiple cores, they aren't the answer, because only one thread can run the interpreter at a time. Use multiprocessing (or just concurrent.futures.ProcessPoolExecutor instead of concurrent.futures.ThreadPoolExecutor) if you need parallelism.
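A sketch of how the executor swap might look, assuming the same hypothetical derive_solver and one-element-list leaves as before. The executor_cls parameter is my own addition for illustration, so the same function can be run with either kind of pool. With processes, the functions and their arguments must be picklable, so everything lives at module level and the demo call sits under a main guard.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def derive_solver(atom, var):
    # Hypothetical stand-in for the question's derive_solver.
    return 1 if atom == var else 0

def combine(lst, d_left, d_right):
    # Apply the sum rule or the product rule to the two sub-derivatives.
    if lst[1] == '+':
        return [d_left, '+', d_right]
    if lst[1] == '*':
        return [[d_left, '*', lst[2]], '+', [lst[0], '*', d_right]]
    raise ValueError('unsupported operator: %r' % (lst[1],))

def derivative(lst, var):
    # Sequential version; this is what each worker runs.
    if len(lst) == 1:
        return derive_solver(lst[0], var)
    return combine(lst, derivative(lst[0], var), derivative(lst[2], var))

def parallel_derivative(lst, var, executor_cls=ProcessPoolExecutor):
    # Only the top-level split is farmed out; pass ThreadPoolExecutor
    # as executor_cls to compare against threads.
    if len(lst) == 1:
        return derive_solver(lst[0], var)
    with executor_cls(max_workers=2) as pool:
        left = pool.submit(derivative, lst[0], var)
        right = pool.submit(derivative, lst[2], var)
        return combine(lst, left.result(), right.result())

if __name__ == '__main__':
    # The guard keeps worker processes from re-running this call on import.
    print(parallel_derivative([[1], '+', [[3], '*', ['x']]], 'x'))
```

For an expression this small, starting processes and pickling the arguments will cost far more than the work itself, so this only pays off when each sub-derivative is expensive.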

abarnert
  • Your code example is confusing. Why is `return` outside of the function? – minerals Mar 13 '15 at 18:11
  • @minerals: This example is a fragment of the code that goes inside the OP's derivative function. All of it, including the local `enthread` definition. (Since `enthread` doesn't need to close over any locals here, you could move it out to the top level if you wanted to reuse it, but leaving it here avoids polluting the namespace with a function that you aren't going to reuse otherwise.) – abarnert Mar 13 '15 at 21:21
  • Threads without a return value are close to useless and a waste of coffee loops. I've spent so much coffee working out that using a queue is a must for serious coding. I tried some shortcuts, but in the end you should use a queue or an external method, which would surely be slower. – m3nda Oct 23 '15 at 15:07
  • `queue.Queue` is not really needed. A simple `[]` would do. – ivan_pozdeev Oct 24 '16 at 19:21