32

Suppose I have the following in Python:

# A loop
for i in range(10000):
    pass  # Do Task A

# B loop
for i in range(10000):
    pass  # Do Task B

How do I run these loops simultaneously in Python?

P Shved
hiiii

7 Answers

38

If you want concurrency, here's a very simple example:

from multiprocessing import Process

def loop_a():
    while True:
        print("a")

def loop_b():
    while True:
        print("b")

if __name__ == '__main__':
    Process(target=loop_a).start()
    Process(target=loop_b).start()

This is just the most basic example I could think of. Be sure to read http://docs.python.org/library/multiprocessing.html to understand what's happening.

If you want to send data back to the program, I'd recommend using a Queue (which in my experience is easiest to use).
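
For instance, here's a minimal sketch of that idea (my addition, not part of the original answer; `worker` is just an illustrative name):

from multiprocessing import Process, Queue

def worker(q):
    # compute something in the child process and hand the result back
    q.put("result from worker")

if __name__ == '__main__':
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    print(q.get())  # blocks until the child puts a result on the queue
    p.join()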

You can use a thread instead if you don't mind the global interpreter lock. Processes are more expensive to instantiate but they offer true concurrency.

Stefano Palazzo
  • I'm four minutes behind Odomontois. – Stefano Palazzo Aug 13 '10 at 09:30
  • @Mahi I'm guessing, but you need to put this into a file and run it. The block of code underneath `if __name__ == "__main__"` is run when the file is run (i.e. "if I am being executed as a program"). If you want to paste it into your Python shell, just get rid of that `if` statement :) – Stefano Palazzo Aug 21 '17 at 13:32
  • I appended time.time() at the end of each print() call and limited the loop to 9 iterations. Here is what I see in my terminal: ('a', 1509314761.857559) ('a', 1509314761.857664) ('a', 1509314761.85767) ('a', 1509314761.857675) ('a', 1509314761.85768) ('a', 1509314761.857685) ('a', 1509314761.85769) ('a', 1509314761.857695) ('a', 1509314761.857699) ('b', 1509314761.858138) ('b', 1509314761.858224) ('b', 1509314761.858229) ('b', 1509314761.858234) ('b', 1509314761.858239) ('b', 1509314761.858244) ('b', 1509314761.858249) ('b', 1509314761.858253) ('b', 1509314761.858258) – weefwefwqg3 Oct 29 '17 at 22:10
  • So it's not really concurrent, right? One runs after the other. – weefwefwqg3 Oct 29 '17 at 22:11
  • @weefwefwqg3 Yes it is. Those processes are completely independent. Your operating system decides which process runs at what time. https://gist.github.com/sfstpala/5b1c0a647824b19a831d2b3dc3017cfd – Stefano Palazzo Oct 30 '17 at 13:16
  • I've tried copying and pasting the code above to test out multiprocessing as I'm hoping to use it for something, but I get the following error: AttributeError: Can't get attribute 'loop_b' on – Emi OB Jun 30 '21 at 14:51
19

There are many possible options for what you want:

use a loop

As many people have pointed out, this is the simplest way.

for i in range(10000):
    # in Python 2, use xrange instead of range
    taskA()
    taskB()

Merits: easy to understand and use, no extra library needed.

Drawbacks: each iteration of taskB has to wait for taskA (or the other way around); the two tasks can't run simultaneously.

multiprocessing

Another thought would be to run the two tasks as separate processes at the same time. Python provides the multiprocessing library; the following is a simple example:

from multiprocessing import Process

# args and kwargs are placeholders for whatever arguments your tasks take
p1 = Process(target=taskA, args=args, kwargs=kwargs)
p2 = Process(target=taskB, args=args, kwargs=kwargs)

p1.start()
p2.start()

Merits: tasks run simultaneously in the background; you can control them (terminate them, wait for them, etc.); tasks can exchange data and can be synchronized when they compete for the same resources.

Drawbacks: too heavy! The OS switches between them frequently, and each process has its own address space even when the data is redundant. If you have a lot of tasks (say 100 or more), it's not what you want.

threading

Threading is like multiprocessing, just lighter-weight; check out this post. Their usage is quite similar:

import threading

# args and kwargs are placeholders, as above
t1 = threading.Thread(target=taskA, args=args, kwargs=kwargs)
t2 = threading.Thread(target=taskB, args=args, kwargs=kwargs)

t1.start()
t2.start()

coroutines

Libraries like greenlet and gevent provide coroutines, which are supposed to be faster than threads; a minimal sketch follows the merits and drawbacks below.

Merits: more flexible and lightweight.

Drawbacks: an extra library is needed, and there is a learning curve.
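
For illustration (my addition, not part of the original answer; it assumes gevent is installed), a minimal sketch of two cooperating loops:

import gevent

def loop_a():
    for _ in range(3):
        print("a")
        gevent.sleep(0)  # yield control so the other greenlet can run

def loop_b():
    for _ in range(3):
        print("b")
        gevent.sleep(0)

# spawn both greenlets and wait for them to finish
gevent.joinall([gevent.spawn(loop_a), gevent.spawn(loop_b)])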

cizixs
16

Why do you want to run the two tasks at the same time? Is it because you think they will go faster? (There is a good chance that they won't.) Why not run the tasks in the same loop, e.g.

for i in range(10000):
    doTaskA()
    doTaskB()

The obvious answer to your question is to use threads - see the Python threading module. However, threading is a big subject and has many pitfalls, so read up on it before you go down that route.

Alternatively you could run the tasks in separate processes, using the Python multiprocessing module. If both tasks are CPU-intensive this will make better use of the multiple cores on your computer.

There are other options such as coroutines, stackless tasklets, greenlets, CSP, etc., but without knowing more about Task A and Task B and why they need to run at the same time it is impossible to give a more specific answer.

Dave Kirby
  • Just a little warning about the threading module: Python has something it calls the Global Interpreter Lock (GIL). This locks out certain (large) areas of Python from running at the same time, even in separate threads. Multiprocessing doesn't have this issue, though it has its own bunch of pitfalls. – Michael Anderson Aug 13 '10 at 07:18
  • if speed is the issue, using some variant of `map`, a list comprehension or a generator expression is the way to go if it's feasible. – aaronasterling Aug 13 '10 at 07:22
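
To illustrate that comment (my sketch; it assumes taskA returns a value):

# instead of appending inside an explicit loop...
results = []
for i in range(10000):
    results.append(taskA(i))

# ...build the list directly with a comprehension or map
results = [taskA(i) for i in range(10000)]
results = list(map(taskA, range(10000)))
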
9
from threading import Thread

def loopA():
    for i in range(10000):
        pass  # Do task A

def loopB():
    for i in range(10000):
        pass  # Do task B

threadA = Thread(target=loopA)
threadB = Thread(target=loopB)
threadA.start()  # start(), not run(): run() would execute the loop in the calling thread
threadB.start()
# Do work independent of loopA and loopB
threadA.join()   # wait for both threads to finish
threadB.join()
Robert H
Odomontois
  • Could you kindly explain what `threadA.join()` and `threadB.join()` are for? – alvas Aug 03 '15 at 10:15
  • @alvas [Thread.join](https://docs.python.org/3.4/library/threading.html#threading.Thread.join) is a method that lets you wait for asynchronous work to finish. Since we need results from both threads, we call it after both jobs have been started. – Odomontois Aug 03 '15 at 10:55
  • I understand that .join is a method that waits for asynchronous work. What I don't understand is what the code above is waiting for and what benefit .join actually provides in the example. – jason Feb 04 '21 at 15:56
1

You could use threading or multiprocessing.

Matt Curtis
  • They are both very different design approaches; it depends on your problem. Threading allows you to share in-process memory and resources; multiprocessing does not. – Matt Curtis May 18 '14 at 07:39
0

How about a single loop: `for i in range(10000): Do Task A; Do Task B`? Without more information I don't have a better answer.

PeterK
0

I find that using the "pool" submodule within "multiprocessing" works amazingly well for executing multiple processes at once within a Python script.

See Section: Using a pool of workers

Look carefully at "# launching multiple evaluations asynchronously may use more processes" in the example. Once you understand what those lines are doing, the following example I constructed will make a lot of sense.

import numpy as np
from multiprocessing import Pool

def desired_function(option, processes, data, etc...):
    # your code will go here. option allows you to make choices within your script
    # to execute desired sections of code for each pool or subprocess.

    return result_array   # "for example"


result_array = np.zeros("some shape")  # This is normally populated by 1 loop; let's try 4.
processes = 4
pool = Pool(processes=processes)
args = (processes, data, etc...)    # Arguments to be passed into desired function.

multiple_results = []
for i in range(processes):          # Submits one task per process (option 1-4 in this case).
    multiple_results.append(pool.apply_async(desired_function, (i+1,)+args))

results = np.array([res.get() for res in multiple_results])  # .get() blocks until each
                                                             # worker is finished!

for i in range(processes):
    result_array = result_array + results[i]  # Combines all datasets!

The code basically runs the desired function for a set number of processes. You have to carefully make sure your function can distinguish between the processes (hence the "option" variable). Additionally, it doesn't have to be an array that is populated at the end, but for my example that's how I used it. Hope this simplifies things or helps you better understand the power of multiprocessing in Python!
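
For comparison (my addition, not part of the original answer), the same fan-out-and-combine pattern is often more compact with Pool.map; here `work` is a hypothetical per-process function:

import numpy as np
from multiprocessing import Pool

def work(option):
    # placeholder for the real per-process computation
    return np.full(3, option)

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        partial_results = pool.map(work, range(1, 5))  # options 1-4, one per worker
    result_array = np.sum(partial_results, axis=0)     # combines all datasets
    print(result_array)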

boardrider