1

I'm trying to understand how to use multiprocessing in my case. I have two functions: all_url() and new_file(). The first one returns a list that contains a lot of elements. The second one iterates over this list in a for loop. I want to parallelize the second function, new_file(), with multiprocessing because the list returned by all_url() is very large.

Example of code:

def all_url():
    urllist = []
    # Here I append elements to urllist
    return urllist

def new_file():
    for each in all_url():
        # There's a lot of code. Each iteration of the loop creates a new HTML file.
        pass

new_file()
SBrain
  • Maybe you should look into the [generator pattern in Python](https://wiki.python.org/moin/Generators). But the gist of it is implementing a __next__() function that returns the next element. – HermanTheGermanHesse Jul 30 '18 at 09:30

2 Answers

1

You would need to do something like this:

from multiprocessing.pool import Pool

def all_url():
    urllist = []
    # Here I append elements to urllist
    return urllist

def new_file():
    # Distribute the URLs across a pool of worker processes.
    with Pool() as pool:
        pool.map(new_file_process_url, all_url())

def new_file_process_url(each):
    # Creates the HTML file for a single URL.
    pass

if __name__ == '__main__':
    new_file()
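
A fuller version of this pattern might look like the sketch below. The URL list, the `output_name` helper, the `html_out` directory, and the HTML content are all placeholders assumed for illustration; the real `all_url()` and file-writing logic would take their place.

```python
import hashlib
import tempfile
from multiprocessing.pool import Pool
from pathlib import Path

# Hypothetical output location, chosen only for this sketch.
OUTPUT_DIR = Path(tempfile.gettempdir()) / "html_out"


def all_url():
    # Placeholder data; the real function builds urllist some other way.
    return [f"https://example.com/page/{i}" for i in range(20)]


def output_name(url):
    # Hypothetical helper: derive a stable, unique file name from the URL.
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    return f"{digest}.html"


def new_file_process_url(each):
    # Runs in a worker process and creates one HTML file for one URL.
    OUTPUT_DIR.mkdir(exist_ok=True)
    path = OUTPUT_DIR / output_name(each)
    path.write_text(f"<html><body><p>{each}</p></body></html>", encoding="utf-8")
    return str(path)


def new_file():
    # map() splits the list across the workers and blocks until all are done.
    with Pool() as pool:
        return pool.map(new_file_process_url, all_url())


if __name__ == "__main__":
    # The guard stops child processes from re-running this block
    # when they import the main module.
    written = new_file()
    print(f"wrote {len(written)} files")
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes; a nested function or lambda would not work with `pool.map`.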
jdehesa
  • I got an error: RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable. – SBrain Jul 30 '18 at 09:53
  • @SBrain Ah yes, you have to wrap the entry point of your program in an `if __name__ == '__main__':` block (see [this question](https://stackoverflow.com/questions/18204782/runtimeerror-on-windows-trying-python-multiprocessing)). I've updated the answer. – jdehesa Jul 30 '18 at 09:57
  • It works. Thank you. Could you tell me whether I can change the number of processes? – SBrain Jul 30 '18 at 10:04
  • @SBrain Yes, have a look at the documentation of [`multiprocessing.pool.Pool`](https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool) to see all the options. You can pass the number of processes as the first parameter of the constructor (e.g. for 4 processes you would do `with Pool(4) as pool:`). – jdehesa Jul 30 '18 at 10:06
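
To make the last comment concrete, here is a minimal sketch of setting the number of worker processes. The `processes` and `chunksize` values, the example URLs, and the body of `new_file_process_url` are assumptions chosen for illustration, not recommendations.

```python
from multiprocessing.pool import Pool


def new_file_process_url(each):
    # Stand-in for the real work of creating one HTML file.
    return len(each)


def new_file(urls, processes=4, chunksize=50):
    # Passing processes=None lets the pool pick a default based on the CPU count.
    # chunksize controls how many URLs are handed to a worker at a time.
    with Pool(processes) as pool:
        return pool.map(new_file_process_url, urls, chunksize=chunksize)


if __name__ == "__main__":
    urls = [f"https://example.com/{i}" for i in range(1000)]
    results = new_file(urls, processes=4)
    print(len(results), "URLs processed")
```

For a very long list, a larger `chunksize` generally reduces the overhead of passing items between processes, at the cost of coarser load balancing.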
0

I'm not sure you really need multiprocessing just because you have to wait for a huge list to be generated. Define all_url() as a generator instead and pull each URL from it only when it is needed.

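def all_url():
    # Yield one URL at a time instead of building and returning urllist.
    # Here, url is whatever was previously appended to urllist.
    yield url

urls = all_url()

def new_file():
    # Fetch the next URL from the generator only when it is needed.
    current_url = next(urls)
    # Do the rest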
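
A runnable sketch of the generator approach could look like this. The URL pattern and the body of `new_file` are placeholders assumed for illustration; here `new_file` takes the URL as an argument, which is a small variation on the snippet above.

```python
def all_url():
    # Yield each URL as soon as it is known instead of collecting a list first.
    for i in range(5):  # placeholder for the real source of URLs
        yield f"https://example.com/page/{i}"


def new_file(each):
    # Stand-in for creating the HTML file; returns the name it would use.
    return each.rsplit("/", 1)[-1] + ".html"


if __name__ == "__main__":
    # The for loop calls next() on the generator behind the scenes,
    # so only one URL is held in memory at a time.
    for current_url in all_url():
        print(new_file(current_url))
```

Note that this only avoids building the whole list up front; the files are still created one after another in a single process.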
Sleeba Paul