
I have some code that looks like this:

for item in list:
    <bunch of slow python code, depending only on item>

I want to speed this up by parallelizing the loop. Normally the multiprocessing module would be perfect for this (see the answers to this question), but it was added in python 2.6 and I'm stuck using 2.4.

What's the best way to parallelize a python loop in python 2.4?

Dan
  • Are you really stuck in python 2.4? Even if you're using an old Linux system you can usually compile your own build of a later version of Python (heck, even Python 3.x), and then put it into a local directory so that it doesn't interfere with the system default. – HardlyKnowEm Jul 26 '13 at 00:05
  • Also, you can write the slow portion of the code in C and use threading. Threading doesn't suffer from the Global Interpreter Lock problem when you aren't using the interpreter in low-level C code. – HardlyKnowEm Jul 26 '13 at 00:07
  • @mlefavor: I've actually got python3.2 installed in my home directory, but the code depends on a bunch of libraries (matplotlib, scipy, and some in-house stuff) I don't want to install myself. It's the same reason I don't want to rewrite it in C. – Dan Jul 26 '13 at 00:07

2 Answers


You might be looking for os.fork(), which makes it easy to hand each child process its own item.

Your for loop will need to look a little different, though: you want the child to break out of the loop as soon as fork returns zero.

import os

L = ["a", "b", "c"]

for item in L:
    pid = os.fork()
    if pid == 0:
        # Child process: stop forking and fall through to do the
        # work for this item.
        break
    else:
        # Parent process: keep looping and fork the next child.
        print "Forked:", pid

if pid != 0: print "Main Execution, Ends"
else: print "Execution:", item
LionKimbro
  • This could cause a problem if the list is very long (hundreds of items). Since this is normally the case, is there an easy way to ensure that only `n` forked processes are running at one time? – Dan Jul 26 '13 at 00:29
  • Figured it out; I can just use numpy.array_split(). – Dan Jul 26 '13 at 03:17

I'm not familiar with using python 2.4, but have you tried using subprocess.Popen and just spawning new processes?

from subprocess import Popen

processes = []
for x in range(n):  # n = number of worker processes to spawn
    processes.append(Popen('python doWork.py', shell=True))

for process in processes:
    process.wait()
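
A variation on the same idea, sketched with a hypothetical inline worker (the `-c` program stands in for a real doWork.py) so the snippet is self-contained: pass each item to the worker on its command line, and use the list form of Popen so no shell is involved.

```python
import sys
from subprocess import Popen

items = ["a", "b", "c"]
processes = []
for item in items:
    # Each worker receives its item as sys.argv[1]; the inline
    # "-c" program is a stand-in for a real doWork.py script.
    p = Popen([sys.executable, "-c",
               "import sys; print('did ' + sys.argv[1])", item])
    processes.append(p)

# Wait for every worker to finish before moving on.
for p in processes:
    p.wait()
```
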
sihrc