Yes, Thread here is threading.Thread.
The Joiner is a way to tell each worker thread to go ahead and exit. If you write workers that consume jobs, process each one, and then continue waiting, the worker will never exit on its own. The Joiner is one way of flagging a worker to exit.
There are two common ways of getting a worker to exit: either a) give it a fixed list of things to do, instead of sending it an open-ended stream of jobs in a queue, or b) send each worker a special "stop processing" sentinel job. For instance, if you send numbers to your workers, send None to get a worker to stop.
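Approach (b) might look like the following sketch, using the standard queue module. The names (worker, jobs, results) are just for illustration:

```python
import threading
import queue

def worker(jobs, results):
    # keep pulling jobs until we see the None sentinel
    while True:
        n = jobs.get()
        if n is None:          # "stop processing" flag
            break
        results.put(n * n)

jobs = queue.Queue()
results = queue.Queue()
threads = [threading.Thread(target=worker, args=(jobs, results))
           for _ in range(3)]
for t in threads:
    t.start()
for n in range(10):
    jobs.put(n)
for _ in threads:
    jobs.put(None)             # one sentinel per worker
for t in threads:
    t.join()
squares = sorted(results.get() for _ in range(10))
print(squares)                 # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Note that you need one sentinel per worker, since each worker consumes exactly one None before exiting.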
Here's an example of the first approach. There's a fixed list of chunks, each of which has a list of tasks. Start 3 workers, divide each chunk into 3 lists of tasks, and give each fixed-size list to a worker. Wait for the workers to finish their lists, then continue with the next chunk.
In the following code we count vowels in the words of a dictionary file. Every 1000 words in the dictionary is a "chunk". Each worker processes 300-400 words, runs in parallel with the others, and exits after its list of words is done. After the first chunk of 1000 words is done, the program loops to handle the next chunk of 1000 words.
Thus, each chunk is run sequentially, but each chunk is handled in 3 parallel pieces.
Have fun!
    # break list of tasks into chunks. All tasks for a specific chunk run
    # in parallel, but chunks are run sequentially.
    import re
    from threading import Thread

    # dictionary with ~100K words, each on a separate line
    wordsf = open('/usr/share/dict/words')

    def count_t(words):
        num_vowels = 0
        consonant_pat = re.compile('[^aeiou]', re.IGNORECASE)
        for word in words:
            # zap consonants; what's left is vowels
            num_vowels += len(consonant_pat.sub('', word))
        print('\t{} vowels'.format(num_vowels))

    while True:
        # read 1000 words into a chunk
        chunk = [wordsf.readline() for _ in range(1000)]
        # exit when the first word is empty (end of file)
        if not chunk or not chunk[0]:
            break
        print('CHUNK {} to {}'.format(
            chunk[0].rstrip(), chunk[-1].rstrip(),
        ))
        # create 3 worker threads; each gets a separate part of the chunk
        workers = [
            Thread(target=count_t, args=(chunk[:300],)),
            Thread(target=count_t, args=(chunk[300:600],)),
            Thread(target=count_t, args=(chunk[600:],)),
        ]
        # run this chunk's workers in parallel
        for worker in workers:
            worker.start()
        # wait for them to complete
        for worker in workers:
            worker.join()
        # continue to next chunk
Example output:
CHUNK A to Ashikaga
897 vowels
956 vowels
1338 vowels
CHUNK Ashikaga's to Boone's
909 vowels
841 vowels
996 vowels
Here are the last three chunks being processed:
CHUNK winter's to yum
798 vowels
734 vowels
825 vowels
CHUNK yummier to
364 vowels
0 vowels
0 vowels