
I'm implementing multithreaded functionality using the design pattern discussed in this answer and this blog post.

My data source is dynamic (it comes from web POSTs), but all the examples I can find use a static data source.

My question is therefore: how can I implement multithreaded functionality with a non-static data input source?

Example code:

import urllib2  # Python 2 module; in Python 3 use urllib.request instead
from multiprocessing.dummy import Pool as ThreadPool

# This is a static data source. I need it to be dynamic!
urls = [
    'http://www.python.org',
    'http://www.python.org/about/',
    # etc.
]

# Make the pool of workers
pool = ThreadPool(4)
# Open the URLs in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)
# Close the pool and wait for the work to finish
pool.close()
pool.join()
Vingtoft

1 Answer


I suppose that by non-static you mean a source that produces its items while the workers are already consuming them. In this case, you can use the Pool.imap() API and pass a generator instead of a prepared list.

import multiprocessing.dummy
import threading


def generate():
    for i in range(20):
        print('generating {}'.format(i))
        yield i


def consume(i):
    print('thread {} consuming {}'.format(threading.current_thread().ident, i))


pool = multiprocessing.dummy.Pool(4)

list(pool.imap(consume, generate()))

Be careful to actually consume the iterable that is the return value of the Pool.imap() call.
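For a source such as incoming web POSTs, one possible way to apply the same idea is to have whatever receives the data push items onto a queue.Queue and let a generator drain that queue for Pool.imap(). The following is only a minimal sketch, assuming Python 3. The names SENTINEL, drain, handle and run are hypothetical, and the producer thread merely stands in for the real web handler.

```python
import queue
import threading
from multiprocessing.dummy import Pool as ThreadPool

SENTINEL = object()  # hypothetical marker meaning "no more items will arrive"


def drain(q):
    """Yield items from the queue as they arrive, stopping at SENTINEL."""
    while True:
        item = q.get()  # blocks until a producer puts something
        if item is SENTINEL:
            return
        yield item


def handle(item):
    # Placeholder for the real per-item work (e.g. processing a POST body)
    return item * 2


def run(items):
    q = queue.Queue()

    def producer():
        # Stand-in for the dynamic source: feeds the queue, then signals the end
        for item in items:
            q.put(item)
        q.put(SENTINEL)

    feeder = threading.Thread(target=producer)
    feeder.start()

    pool = ThreadPool(4)
    # imap returns results in input order; list() consumes them all
    results = list(pool.imap(handle, drain(q)))
    pool.close()
    pool.join()
    feeder.join()
    return results


if __name__ == '__main__':
    print(run(range(5)))
```

In a long-running server there may be no natural end of input. In that case the sentinel would only be sent at shutdown, and the results would be consumed in a loop rather than collected with list().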

Thomas Lotze