25

I have a series of 'tasks' that I would like to run in separate threads. The tasks are to be performed by separate modules, each containing the business logic for processing its tasks.

Given a tuple of tasks, I would like to be able to spawn a new thread for each module as follows.

from foobar import alice, bob, charles
data = getWorkData()
# these are enums (which I just found Python doesn't support natively) :(
tasks = (alice, bob, charles)

for task in tasks
  # Ok, just found out Python doesn't have a switch - @#$%!
  # yet another thing I'll need help with then ...
  switch
    case alice:
      #spawn thread here - how ?
      alice.spawnWorker(data)

No prizes for guessing I am still thinking in C++. How can I write this in a Pythonic way, using Pythonic 'enums' and 'switch'es, and run each module in a new thread?

Obviously, the modules will all have a class that is derived from an ABC (abstract base class) called Plugin. The spawnWorker() method will be declared on the Plugin interface and defined in the classes implemented in the various modules.

Maybe there is a better (i.e. Pythonic) way of doing all this? I'd be interested in knowing.

[Edit]

I've just been reading a bit more and it seems Python does not implement threading in the true sense (at least, not in the sense that a C++ programmer would think). In any case, that's not a show stopper for me. Each of the tasks is fairly time consuming, and I don't want to hold up starting one task until another has completed; that's why I am using threading. Time slicing does not bother me much - so long as they are all started at pretty much the same time (or shortly after each other), Python can then timeslice between the threads as much as it wants - it's fine by me.

I have seen an answer to a similar question here on SO.

A user provides a simple class for threading as follows:

import threading

class Foo(threading.Thread):
    def __init__(self, x):
        self.__x = x
        threading.Thread.__init__(self)

    def run(self):
        # the work done by the thread goes here
        print(str(self.__x))

for x in range(20):
    Foo(x).start()

I am thinking of using this for my ABC Plugin. My question then is: where do I put the code where the actual task gets done (i.e. the business logic)? I assume this goes in the run() method of the Foo class (obvious question, I know, but I don't want to make any assumptions).

Is my thinking on the right track or flawed (if flawed - what have I missed?)

morpheous
  • Instead of switch-case, why not use a proper polymorphism (ABC inheritance, or duck typing)? – Santa May 21 '10 at 15:49
  • @Santa: Good point. That's how I would have done it (polymorphism) in C++. But I wasn't quite sure if Python supported that. – morpheous May 21 '10 at 17:08
  • @morpheous You'll find that, on top of the traditional inheritance-based polymorphism, Python also supports more dynamic approaches to polymorphism, the most prominent of which is duck typing. – Santa May 21 '10 at 17:44

4 Answers

42

Instead of switch-case, why not use proper polymorphism? For example, here is what you can do with duck typing in Python:

In, say, alice.py:

def do_stuff(data):
    print('alice does stuff with %s' % data)

In, say, bob.py:

def do_stuff(data):
    print('bob does stuff with %s' % data)

Then in your client code, say, main.py:

import threading
import alice, bob

def get_work_data():
    return 'data'

def main():
    tasks = [alice.do_stuff, bob.do_stuff]
    data = get_work_data()
    for task in tasks:
        t = threading.Thread(target=task, args=(data,))
        t.start()

if __name__ == '__main__':
    main()

Let me know if I need to clarify.
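For comparison, the inheritance-based flavour of polymorphism mentioned in the question could look roughly like the sketch below. It is only an illustration, assuming Python 3's abc module; the names Plugin, AlicePlugin, BobPlugin, do_work, spawn_worker and run_plugins are hypothetical and not taken from any real module.

```python
import threading
from abc import ABC, abstractmethod


class Plugin(ABC):
    """Hypothetical base class; each plugin supplies its own do_work()."""

    @abstractmethod
    def do_work(self, data):
        """Business logic for one task; returns a result."""

    def spawn_worker(self, data, results):
        # Run do_work() in a new thread and store its result
        # under the plugin's class name (one distinct key per plugin).
        def target():
            results[type(self).__name__] = self.do_work(data)

        thread = threading.Thread(target=target)
        thread.start()
        return thread


class AlicePlugin(Plugin):
    def do_work(self, data):
        return 'alice does stuff with %s' % data


class BobPlugin(Plugin):
    def do_work(self, data):
        return 'bob does stuff with %s' % data


def run_plugins(plugins, data):
    # Start every plugin first, then wait for all of them to finish.
    results = {}
    threads = [plugin.spawn_worker(data, results) for plugin in plugins]
    for thread in threads:
        thread.join()
    return results
```

Calling run_plugins([AlicePlugin(), BobPlugin()], 'data') starts both workers before waiting on either, so one slow task does not delay the start of the next. Trying to instantiate Plugin directly raises TypeError because do_work is abstract.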

Santa
  • +1 your code is nice and simple - however, you are not passing the data to the spawned threads - could you please modify your code to show how data is passed to the spawned threads (like in my pseudocode)? tx – morpheous May 21 '10 at 23:14
  • Just a note that if `data` happens to be mutable, you'll want to either pass a copy to each Thread, or also pass a lock object (http://docs.python.org/library/threading.html#lock-objects). – tgray May 24 '10 at 13:07
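The lock option from the last comment can be sketched as follows. This is a minimal illustration, not part of any answer above: the functions bump and run_counter and the shared dictionary are hypothetical, and the shared counter stands in for whatever mutable `data` the tasks would touch.

```python
import threading


def bump(counter, lock, times):
    # Read-modify-write on shared state; the lock makes each update atomic.
    for _ in range(times):
        with lock:
            counter['n'] += 1


def run_counter(num_threads, times):
    counter = {'n': 0}        # mutable data shared by every thread
    lock = threading.Lock()   # a single lock guarding `counter`
    threads = [
        threading.Thread(target=bump, args=(counter, lock, times))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter['n']
```

The other option from the comment is to give each thread its own copy, for example with copy.deepcopy(data), in which case no lock is needed because nothing is shared.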
5
import threading
from foobar import alice, bob, charles

data = get_work_data() # names_in_pep8 are more Pythonic than camelCased

for mod in [alice, bob, charles]:
    # mod is an object that represents a module
    worker = getattr(mod, 'do_work')
    # worker now is a reference to the function like alice.do_work
    t = threading.Thread(target=worker, args=[data])
    # uncomment following line if you don't want to block the program
    # until thread finishes on termination
    #t.daemon = True 
    t.start()

Put your logic in do_work functions of corresponding modules.
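To make the idea of iterating over modules concrete without needing real alice.py and bob.py files, here is a self-contained sketch. It builds stand-in module objects with types.ModuleType; the helpers make_module and run_all, and the extra `results` argument on do_work, are assumptions added for illustration only.

```python
import threading
import types


def make_module(name):
    # Build a stand-in module object exposing a do_work function,
    # in place of a real file such as alice.py.
    mod = types.ModuleType(name)

    def do_work(data, results):
        results.append('%s processed %s' % (name, data))

    mod.do_work = do_work
    return mod


def run_all(modules, data):
    results = []
    threads = []
    for mod in modules:
        # Look the function up by name on the module object.
        worker = getattr(mod, 'do_work')
        thread = threading.Thread(target=worker, args=(data, results))
        thread.start()
        threads.append(thread)
    # Wait for every worker before reading the results.
    for thread in threads:
        thread.join()
    return results
```

The order of entries in the returned list depends on thread scheduling, so callers should not rely on it.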

nkrkv
  • Good answer, but your last line should be `t.start()`. – tgray May 21 '10 at 15:42
  • +1 I really like this answer because it seems I can directly iterate over the modules (can you confirm that is the case? - that would be so cool). If the answer is yes, it means that so long as each module has a function called 'do_work', then the code above will spawn the threads and run the do_work() function in each of the modules in separate threads (is my understanding correct?). Looks like the last method invocation should be start() though right? – morpheous May 21 '10 at 23:20
  • @morpheus: You are correct. In Python, modules are also first-class objects. You can pass it to functions, put it in lists, etc. And yes, the thread should be sent the `start` method. – Santa May 23 '10 at 02:27
  • @santa, @tgray. Yep, there should be `start` instead of `run` – nkrkv May 24 '10 at 09:38
4

Sequential execution:

from foobar import alice, bob, charles

for fct in (alice, bob, charles):
    fct()

Parallel execution:

from threading import Thread
from foobar import alice, bob, charles

for fct in (alice, bob, charles):
    Thread(target=fct).start()
AstraSerg
dugres
  • @nailxx the `run` method is where you define the work needed in that thread's execution. The `start` method is what you need to send to the thread object to do its `run` in a separate thread of execution. Otherwise, you're just running it in the current thread, therefore defeating the purpose of having a `Thread` defined to begin with. – Santa May 21 '10 at 16:09
  • @nailxx, I put a link to the documentation that explains that in a comment on your post. – tgray May 21 '10 at 18:08
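The difference between `run` and `start` described in these comments can be observed directly. The sketch below uses a hypothetical Recorder class and compare function that simply note which thread executed run().

```python
import threading


class Recorder(threading.Thread):
    """Hypothetical Thread subclass that records which thread ran run()."""

    def __init__(self):
        threading.Thread.__init__(self)
        self.ran_in = None

    def run(self):
        self.ran_in = threading.current_thread()


def compare():
    direct = Recorder()
    direct.run()        # plain method call: executes in the calling thread

    started = Recorder()
    started.start()     # creates a new thread, which then calls run()
    started.join()
    return direct.ran_in, started.ran_in, started
```

After compare() returns, the first value is the caller's own thread, while the second is the Recorder object that was started, which shows that only start() moved the work to a separate thread.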
1

Python can hold functions as objects. To work around the lack of a switch statement, may I suggest the following:

case_alice = lambda data : alice.spawnWorker(data)

my_dict[alice] = case_alice

forming a dictionary to hold your "case" statements.

Let me take it even further:

data = getWorkData()
case_alice = lambda d : alice.spawnWorker( d )
case_bob = lambda d : bob.spawnWorker( d )
case_charles = lambda d : charles.spawnWorker( d )

switch = { alice : case_alice, bob : case_bob, charles : case_charles }
spawn = lambda person : switch[ person ]( data )
[ spawn( item ) for item in (alice, bob, charles )]
wheaties
  • @wheaties: +1 for the Pythonic code (still reading it to make sure I understand all that's going on there). Could you possibly extend your snippet a bit with how to use the multiprocessing module to actually spawn the threads (or processes - it doesn't matter) – morpheous May 21 '10 at 13:59
  • The `lambda` is wholly unnecessary. Simply do `my_dict[alice] = alice.spawnWorker`, and `[ switch[item](data) for item in ... ]` – Thomas Wouters May 21 '10 at 14:37
  • @morpheous No experience with the multiprocessing library. @Thomas Wouters You're absolutely correct. However, I think leaving them as they are is more conducive to understanding functions as first class objects. When I was first learning, seeing lambda reminded me of such. – wheaties May 21 '10 at 14:44
  • Still, `lambda` is for the most part unnecessary. I never used them in my projects, myself, and Guido does not seem to like it, either. – Santa May 21 '10 at 16:11
  • I'm not sure how using lambda to pretend the `spawnWorker` method isn't a first-class object helps with understanding that functions are first-class objects :) – Thomas Wouters May 22 '10 at 16:54
  • Those coming from say, Java or sometimes from C++, will see `spawnWorker` not as a function, per se, but rather as the object `spawnWorker` would return. – wheaties May 22 '10 at 18:52
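The lambda-free dictionary dispatch suggested in the comments might look like the sketch below. The worker functions, the string keys and the dispatch helper are hypothetical stand-ins for the real spawnWorker methods, chosen so the example runs on its own.

```python
def alice_worker(data):
    return 'alice got %s' % data


def bob_worker(data):
    return 'bob got %s' % data


def unknown_worker(data):
    # Plays the role of a switch statement's default branch.
    return 'nobody handles %s' % data


# Functions are stored directly as dictionary values; no lambda wrapper is needed.
switch = {'alice': alice_worker, 'bob': bob_worker}


def dispatch(name, data):
    # dict.get falls back to the default handler for unknown keys.
    return switch.get(name, unknown_worker)(data)
```

For example, dispatch('alice', 'data') calls alice_worker, and an unrecognised name falls through to unknown_worker instead of raising KeyError.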