
I'm learning Python multithreading and queues. The following creates several threads that pass data through a queue to another thread for printing:

import time
import threading
import Queue

queue = Queue.Queue()

def add(data):
    return ["%sX" % x for x in data]

class PrintThread(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        data = self.queue.get()
        print data
        self.queue.task_done()

class MyThread(threading.Thread):
    def __init__(self, queue, data):
        threading.Thread.__init__(self)
        self.queue = queue
        self.data = data

    def run(self):
        self.queue.put(add(self.data))

if __name__ == "__main__":
    a = MyThread(queue, ["a","b","c"])
    a.start()
    b = MyThread(queue, ["d","e","f"])
    b.start()
    c = MyThread(queue, ["g","h","i"])
    c.start()

    printme = PrintThread(queue)
    printme.start()

    queue.join()

However, I see only the data from the first thread print out:

['aX', 'bX', 'cX']

Then nothing else, but the program doesn't exit. I have to kill the process to have it exit.

Ideally, after each MyThread does its data processing and puts the result on the queue, that thread should exit. Simultaneously, the PrintThread should take whatever is on the queue and print it.

After all MyThread threads have finished and the PrintThread thread has finished processing everything on the queue, the program should exit cleanly.

What have I done wrong?

EDIT:

If each MyThread thread takes a while to process, is there a way to guarantee that the PrintThread thread will wait for all the MyThread threads to finish before it will exit itself?

That way the print thread will definitely have processed every item on the queue, because all the other threads have already exited.

For example,

class MyThread(threading.Thread):
    def __init__(self, queue, data):
        threading.Thread.__init__(self)
        self.queue = queue
        self.data = data

    def run(self):
        time.sleep(10)
        self.queue.put(add(self.data))

The above modification waits for 10 seconds before putting anything on the queue. The print thread will run, but I think it's exiting too early since there is no data on the queue yet, so the program prints out nothing.

warchest

1 Answer

Your PrintThread does not loop but instead only prints out a single queue item and then stops running.

Therefore, the other two items are never taken off the queue and never marked with task_done(), so the queue.join() statement blocks forever and prevents the main program from terminating.
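The blocking behaviour of join() can be reproduced in isolation. The following is a minimal sketch, not the code from the question: `join_returns` is a hypothetical helper that runs `q.join()` in a background thread so the check cannot hang, and the import shim is an assumption added so that the snippet runs on both Python 2 and Python 3.

```python
import threading

try:
    import queue            # Python 3 module name
except ImportError:
    import Queue as queue   # Python 2 module name, as used in the question


def join_returns(q, timeout=0.5):
    """Return True if q.join() completes within `timeout` seconds (hypothetical helper)."""
    waiter = threading.Thread(target=q.join)
    waiter.daemon = True    # do not keep the interpreter alive if join() stays blocked
    waiter.start()
    waiter.join(timeout)
    return not waiter.is_alive()


q = queue.Queue()
for item in (["aX"], ["dX"], ["gX"]):
    q.put(item)

# Consume a single item, as the original PrintThread.run() does.
q.get()
q.task_done()
stuck = not join_returns(q)   # two items still have no matching task_done()

# Consume the remaining two items.
for _ in range(2):
    q.get()
    q.task_done()
released = join_returns(q)    # every put() is now matched by a task_done()
```

With one item consumed, `stuck` is true; once all three items have been marked done, `released` is true.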

Change the run() method of your PrintThread into the following code in order to have all queue items processed:

def run(self):
    try:
        while True:
            # get_nowait() raises Queue.Empty as soon as the queue has no items
            data = self.queue.get_nowait()
            print data
            self.queue.task_done()
    except Queue.Empty:
        # All items have been taken off the queue
        pass
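Note that a non-blocking loop like this stops as soon as the queue is momentarily empty, which matters if the producer threads are slow. One common alternative is a sentinel value. This is a sketch under assumptions, not part of the original answer: the main program joins every producer and only then enqueues the sentinel, so the printer exits last. The names `STOP`, `Producer`, `Printer` and `run_pipeline` are hypothetical, and the import shim is there only so the snippet runs on both Python 2 and Python 3.

```python
import threading
import time

try:
    import queue            # Python 3 module name
except ImportError:
    import Queue as queue   # Python 2 module name, as used in the question

STOP = object()   # hypothetical sentinel meaning "no more items will arrive"


def add(data):
    return ["%sX" % x for x in data]


class Producer(threading.Thread):
    def __init__(self, q, data, delay):
        threading.Thread.__init__(self)
        self.q = q
        self.data = data
        self.delay = delay

    def run(self):
        time.sleep(self.delay)        # stands in for slow processing
        self.q.put(add(self.data))


class Printer(threading.Thread):
    def __init__(self, q):
        threading.Thread.__init__(self)
        self.q = q
        self.seen = []                # kept so the result can be inspected afterwards

    def run(self):
        while True:
            item = self.q.get()       # blocks until an item is available
            if item is STOP:
                break
            print(item)
            self.seen.append(item)


def run_pipeline(delay=0.1):
    q = queue.Queue()
    printer = Printer(q)
    printer.start()
    chunks = (["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"])
    producers = [Producer(q, chunk, delay) for chunk in chunks]
    for p in producers:
        p.start()
    for p in producers:
        p.join()                      # wait until every producer has put its result
    q.put(STOP)                       # only now tell the printer to finish
    printer.join()
    return printer.seen


if __name__ == "__main__":
    run_pipeline()
```

Because the main program joins the threads rather than the queue, this variant does not need task_done() or queue.join() at all. The order in which the three lists are printed depends on thread scheduling.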
DocZerø
  • How do you solve the problem where the `MyThread` threads all take 5+ seconds to process? For example, if we put `time.sleep(20)` at the start of `run()` in `MyThread`, the end result is empty; nothing prints out. Is there a way to guarantee that `PrintThread` is the last thread to exit? That way all data is guaranteed to be in the queue and the PrintThread has a chance to process it, regardless of the time it takes the other threads to process the data. – warchest Jul 23 '16 at 20:22
  • You would need to keep the `PrintThread` running, e.g. by creating it as a [daemon thread](https://docs.python.org/3.5/library/threading.html#threading.Thread.daemon). Another way is to use [`threading.Event`](https://docs.python.org/3.5/library/threading.html#threading.Event) to signal a thread to stop from the main program (as the thread itself doesn't know whether to expect more items on the queue or not). – DocZerø Jul 23 '16 at 20:31
  • I set `printme.setDaemon(True)`, but the program is printing some weird output. For example I see broken lists like `['gX', 'hX'` or `['aX', 'bX', 'cX'] ['gX',`. Is this a problem? – warchest Jul 23 '16 at 20:42
  • Using `print` in Python 2 isn't thread-safe. See [this](http://stackoverflow.com/questions/3029816/how-do-i-get-a-thread-safe-print-in-python-2-6) and [this](http://stackoverflow.com/questions/7877850/python-2-7-print-thread-safe?noredirect=1&lq=1) SO question. – DocZerø Jul 23 '16 at 20:48
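The two suggestions from the comments above, a `threading.Event` to stop the printer and a thread-safe replacement for `print`, could be combined roughly as follows. This is only a sketch under assumptions: `print_lock`, `safe_print`, `Printer` and `demo` are hypothetical names, the 0.05 second poll interval is arbitrary, and the import shim exists only so the snippet runs on both Python 2 and Python 3.

```python
import sys
import threading

try:
    import queue            # Python 3 module name
except ImportError:
    import Queue as queue   # Python 2 module name, as used in the question

print_lock = threading.Lock()   # hypothetical lock shared by every thread that prints


def safe_print(value):
    # Write the whole line while holding the lock so that output from
    # different threads cannot interleave.
    with print_lock:
        sys.stdout.write("%s\n" % (value,))
        sys.stdout.flush()


class Printer(threading.Thread):
    def __init__(self, q, stop_event):
        threading.Thread.__init__(self)
        self.q = q
        self.stop_event = stop_event
        self.printed = []

    def run(self):
        # Keep going until a stop was requested AND the queue has been drained.
        while not (self.stop_event.is_set() and self.q.empty()):
            try:
                item = self.q.get(timeout=0.05)   # wake up regularly to re-check the event
            except queue.Empty:
                continue
            safe_print(item)
            self.printed.append(item)
            self.q.task_done()


def demo():
    q = queue.Queue()
    stop = threading.Event()
    printer = Printer(q, stop)
    printer.start()
    workers = [
        threading.Thread(target=q.put, args=(["%sX" % x for x in chunk],))
        for chunk in ("abc", "def", "ghi")
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()        # all producers are finished before the stop is signalled
    stop.set()
    printer.join()      # the printer is the last thread to exit
    return printer.printed


if __name__ == "__main__":
    demo()
```

The event is set only after every producer has been joined, so an empty queue at that point really does mean there is nothing left to print. Unlike a daemon thread, the printer here is joined explicitly, so it is never cut off in the middle of a line when the main program ends.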