21

I have searched everywhere and can't find a simple example of iterating over a loop with multithreading.

For example, how can I multithread this loop?

for item in range(0, 1000):
    print(item)

Is there any way to cut it into, say, 4 threads, so that each thread handles 250 iterations?

MisterMiyagi
Anthony

3 Answers

27

The easiest way is with multiprocessing.dummy (which uses threads instead of processes) and a Pool:

import multiprocessing.dummy as mp

def do_print(s):
    print(s)

if __name__ == "__main__":
    p = mp.Pool(4)
    p.map(do_print, range(0, 10))  # range(0, 1000) if you want to replicate your example
    p.close()
    p.join()

You may also want to try real multiprocessing if you need to better utilize multiple CPUs, but there are several caveats and guidelines to follow then.
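One caveat worth knowing up front: with real processes, the worker function must be defined at module top level so it can be pickled, and the pool should be created under the `if __name__ == "__main__":` guard. A minimal sketch of the same code with a process-based Pool (assuming Python 3):

```python
import multiprocessing as mp  # real processes instead of threads

def do_print(s):
    # defined at module top level so worker processes can pickle it by name
    print(s)

if __name__ == "__main__":  # required on platforms that spawn worker processes
    with mp.Pool(4) as p:   # same Pool interface as multiprocessing.dummy
        p.map(do_print, range(0, 10))
```

The with block closes and joins the pool automatically (Pool has supported the context-manager protocol since Python 3.3).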

Other methods of Pool might suit your needs better, depending on what you are actually trying to do.
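For instance, if do_print needed extra arguments, two common options (a sketch, assuming Python 3) are fixing them with functools.partial or bundling argument tuples for Pool.starmap:

```python
import multiprocessing.dummy as mp
from functools import partial

def do_print(prefix, s):
    print(prefix, s)

if __name__ == "__main__":
    with mp.Pool(4) as p:
        # Option 1: fix the extra argument up front with functools.partial
        p.map(partial(do_print, "item:"), range(10))
        # Option 2: bundle the arguments into tuples and unpack with starmap
        p.starmap(do_print, [("item:", i) for i in range(10)])
```

starmap is available since Python 3.3; on Python 2 you would pack the arguments into the iterable yourself, e.g. with zip, as the comments below suggest.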

Yui
janbrohl
  • Thanks! What if I need to pass other arguments to the `do_print` function? – Anthony Aug 09 '16 at 16:54
  • that depends - with multiprocessing.dummy there should be no problems but if you want to make sure you should put the args in the iterable - eg with [zip](https://docs.python.org/2.7/library/functions.html#zip) or [list comprehensions](https://docs.python.org/2.7/tutorial/datastructures.html#list-comprehensions) – janbrohl Aug 09 '16 at 17:03
  • For some reason, this code creates about 60 (!) threads to iterate over `range(0, 1000)` for me (Python 3.5)... – ForceBru Aug 09 '16 at 17:26
  • @janbrohl: To avoid unnecessary temporaries, you'd want to use the Py3's `zip` (on Py2, [replace bad `zip` with good `zip` using `from future_builtins import zip`](https://docs.python.org/2/library/future_builtins.html#future_builtins.zip)) or a [generator expression](https://docs.python.org/2/howto/functional.html#generator-expressions-and-list-comprehensions), so the arguments are produced lazily (as needed) instead of eagerly (delaying dispatch of first call and wasting memory). – ShadowRanger Aug 09 '16 at 17:31
  • @ForceBru: If you used `multiprocessing.dummy`, that should be impossible; the `Pool` constructor given is only using four worker threads. – ShadowRanger Aug 09 '16 at 17:33
  • @ShadowRanger, it should be impossible, but occurs when I use `imap` or `imap_unordered` instead of `map`. The first spawns around 10 threads (already quite odd), and the second one about 60 of them. – ForceBru Aug 09 '16 at 17:37
  • @ForceBru what's the actual code you are using so we could replicate what you see? – miraculixx Aug 09 '16 at 17:51
  • @miraculixx, the code in this answer, but with `map` replaced with `imap` or `imap_unordered` and `range(0, 1000)`. – ForceBru Aug 09 '16 at 17:52
  • @ForceBru how did you check the number of threads being created? – Sundeep Pidugu Aug 01 '19 at 06:43
  • @SundeepPidugu, probably using [`threading.active_count`](https://docs.python.org/3/library/threading.html#threading.active_count). – ForceBru Aug 01 '19 at 09:37
  • Can you please answer on this link: https://stackoverflow.com/questions/68724573/using-parallel-method-instead-of-sequential-loop – Aadhi Verma Aug 10 '21 at 11:16
  • very cool trick and very easy to set since is the same as multithread or multicore =) – pelos Apr 07 '22 at 15:23
10

You'll have to do the splitting manually:

import threading

def ThFun(start, stop):
    for item in range(start, stop):
        print(item)

for n in range(0, 1000, 100):
    stop = min(n + 100, 1000)
    threading.Thread(target=ThFun, args=(n, stop)).start()

This code uses multithreading, which means that everything will be run within a single Python process (i.e. only one Python interpreter will be launched).

Multiprocessing, discussed in the other answer, means running some code in several Python interpreters (in several processes, not threads). This may make use of all the CPU cores available, so it is useful when you're focusing on the speed of your code (print a ton of numbers until the terminal hates you!), not simply on parallel processing.¹


¹ multiprocessing.dummy turns out to be a wrapper around the threading module. multiprocessing and multiprocessing.dummy have the same interface, but the former does parallel processing using processes, while the latter uses threads.
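Note that the threading example above starts the workers but never waits for them, so any code after the loop runs before the workers are done. If you need to wait for them before continuing, keep references to the Thread objects and join() them, e.g. (a sketch):

```python
import threading

def th_fun(start, stop):
    # worker: print its own slice of the range
    for item in range(start, stop):
        print(item)

threads = []
for n in range(0, 1000, 100):
    t = threading.Thread(target=th_fun, args=(n, min(n + 100, 1000)))
    threads.append(t)
    t.start()

for t in threads:
    t.join()  # block until every worker has finished
```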

ForceBru
  • multiprocessing.dummy actually uses threads - edited my answer to make that more clear – janbrohl Aug 09 '16 at 17:10
  • `multiprocessing.dummy` is indeed a wrapper around the `threading` module, but `multiprocessing` itself works exactly as described in the crossed-out part of the answer. Which one is chosen depends on what behavior one wants, the code is the very same (that's a good thing). – miraculixx Aug 09 '16 at 17:54
  • @miraculixx, this is why I decided not to remove the now crossed-out part completely: `multiprocessing` and `multiprocessing.dummy` have similar interfaces, but the choice of a module affects the behavior (threads vs processes). – ForceBru Aug 09 '16 at 17:57
  • @ForceBru I see, the problem is that it now looks as though what you wrote is incorrect, which it isn't, possibly confusing someone looking for multiprocessing ... `multiprocessing` and `multiprocessing.dummy` have the _exact same_ interface, that's the whole point. – miraculixx Aug 09 '16 at 18:00
  • @miraculixx: Not exactly, but close enough, and they try to keep them consistent. `multiprocessing.dummy` accidentally omitted the `cpu_count` function for example. – ShadowRanger Aug 09 '16 at 19:14
  • Can you please answer on this link: https://stackoverflow.com/questions/68724573/using-parallel-method-instead-of-sequential-loop? – Aadhi Verma Aug 10 '21 at 11:16
6

Since Python 3.2, the concurrent.futures standard library has provided primitives for concurrently mapping a function across an iterable. Since map and for are closely related, this makes it easy to convert a for loop into a multi-threaded/multi-processed loop:

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as executor:
    executor.map(print, range(0, 1000))
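The executor's map also returns the result of each call, in input order, which print discards; a sketch of collecting them (the work function and max_workers value here are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def work(item):
    return item * item

with ThreadPoolExecutor(max_workers=4) as executor:
    # map yields results in input order, even though the calls
    # run concurrently across the four worker threads
    results = list(executor.map(work, range(1000)))

print(results[:5])  # [0, 1, 4, 9, 16]
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor gives the process-based variant with the same interface.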
MisterMiyagi