
I have a simple question: I want to run a function on every list in a list of lists at the same time, i.e. in parallel.

Here is essentially what I'm working with:

def func(x):
    print(x)

# space is a list of lists, e.g. [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
for objectList in space:
    for subObject in objectList:
        func(subObject)

Ideally, I'd like func() to run on every objectList at once. I assume this calls for threading or parallelism, but this is my first attempt at either.

My question is: is there a module, a built-in function, or a standard way in Python to accomplish this?

And sorry if I somehow broke a rule on this site or something or if you think this question is stupid or a duplicate or whatever.

martineau
randomUser
  • https://stackoverflow.com/questions/3329361/python-something-like-map-that-works-on-threads – Skam Apr 20 '19 at 21:09
  • Even threading or parallelism doesn't run everything at exactly the same moment; there is always a small delay between executions. In Python, using threads can sometimes take longer than running without them. – furas Apr 20 '19 at 21:12
  • @furas: Sometimes with threads/parallelism it *does* happen at the same time (e.g. if the threads are running on independent CPUs, or independent cores of the same CPU). But with CPython that can be hard to achieve because of the GIL. – psmears Apr 20 '19 at 21:19
  • @psmears I was also thinking about the problem with the GIL :) – furas Apr 20 '19 at 21:22
  • Are you trying to run func() on every objectList in parallel, or are you trying to run func() on every subObject in all of the objectLists in parallel? – Kapocsi Apr 20 '19 at 21:28

2 Answers

1

From the Python docs:

the Pool object [...] offers a convenient means of parallelizing the execution of a function across multiple input values, distributing the input data across processes (data parallelism)

You could use this pattern, which is also from the docs:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
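
Applied to the nested list from the question, the same pattern might look like the sketch below. It assumes the values in space can be pickled; process_list, the sample data, and the pool size of 3 are hypothetical choices made for illustration, and func returns a value here instead of printing. Mapping over space gives one task per inner list, while flattening first gives one task per sub-object.

```python
from itertools import chain
from multiprocessing import Pool


def func(x):
    # Stand-in for the real work; returns a value instead of printing
    return x * 10


def process_list(object_list):
    # One task per inner list: items inside a list are handled sequentially
    return [func(sub_object) for sub_object in object_list]


if __name__ == '__main__':
    space = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]  # sample data (assumption)
    with Pool(3) as p:
        # One task per objectList
        per_list = p.map(process_list, space)
        # One task per subObject, across all of the lists
        per_item = p.map(func, chain.from_iterable(space))
    print(per_list)
    print(per_item)
```

Which variant fits depends on whether the items inside one objectList have to be processed in order relative to each other.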
Kapocsi
0

As others have pointed out, you should have a good reason to use threading or parallel computing, because it also requires extra work from you, such as distributing the input (load) and managing the workers (worker threads). There are certainly libraries with those mechanics built in. For simple applications you can just use the threading.Thread class. Here's a simple example using threads. This code creates one worker for every sub-object in space.

import threading


def func(x):
    print(x)


class FuncThread(threading.Thread):
    def __init__(self, x):
        threading.Thread.__init__(self)
        self.x = x

    def run(self):
        # Worker code: runs in its own thread; call func(self.x) here for real work
        print('doing stuff with:', self.x)


space = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
worker_handlers = list()

for objectList in space:
    for subObject in objectList:
        worker_handlers.append(FuncThread(subObject))

for worker in worker_handlers:
    worker.start()

print('The main program continues to run in foreground.')

for worker in worker_handlers:
    worker.join()  # Wait for the background tasks to finish
print('Main program waited until background was done.')
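
For comparison, the standard library's concurrent.futures.ThreadPoolExecutor is one option that handles the worker management for you. This is a minimal sketch under assumptions, not a replacement for the code above: run_all and max_workers=4 are illustrative names and values, and func returns a value so the results can be collected.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import chain


def func(x):
    # Stand-in for the real work
    return x * x


def run_all(space, max_workers=4):
    # The executor owns a fixed pool of threads and hands items to them;
    # map() yields the results in input order
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(func, chain.from_iterable(space)))


space = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]  # same sample data as above
results = run_all(space)
print(results)
```

Unlike one thread per sub-object, the pool caps the number of threads, which matters once space gets large. As the comments on the question note, the GIL in CPython means threads mostly pay off for I/O-bound work.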
mlotz