
I came across a scenario where I need to run a function in parallel for a list of values in Python. I learnt that executor.map from concurrent.futures will do the job, and I was able to parallelize the function using the syntax executor.map(func, [values]).

But now I have come across the same scenario (i.e. the function has to run in parallel), except that the function signature is different from the previous one. It is given below.

  def func(search_id,**kwargs):
      # somecode
      return list

  container = []
  with concurrent.futures.ProcessPoolExecutor() as executor:
       container.extend(executor.map(func, (searchid,sitesearch=site),[list of sites]))

I don't know how to achieve the above. Can someone guide me please?

James K J
  • [`**kwargs`](https://stackoverflow.com/questions/1769403/what-is-the-purpose-and-use-of-kwargs) is just a way to take optional named input. If you don't have any specific keyword args to pass in, just don't pass it in. Otherwise, `kwargs` should be a dictionary – sshashank124 Dec 29 '19 at 14:46
  • Are there any arguments that get passed as kwargs that are required? – Iain Shelvington Dec 29 '19 at 14:48
  • @sshashank124, I have updated the question with the specific arguments. Can you look at it please? – James K J Dec 29 '19 at 14:51
  • @IainShelvington. You meant to ask whether there are any arguments in kwargs which are required? Yes, I have updated the question. – James K J Dec 29 '19 at 14:52
  • Look at the following question for details on how to pass a kwargs dictionary as an argument: https://stackoverflow.com/questions/1769403/what-is-the-purpose-and-use-of-kwargs – sshashank124 Dec 29 '19 at 14:54
  • What are you trying to parallelize? `executor.map(func, iterable)` will apply the function to each item in the iterable, so I am not sure that is what you want. – abhilb Dec 29 '19 at 14:54
  • Where is your iterable of values to pass to `map`? – Iain Shelvington Dec 29 '19 at 14:59
  • @IainShelvington. Sorry I will update the question. – James K J Dec 29 '19 at 15:00
  • @abhilb, Apologies for incomplete question. I beg your pardon – James K J Dec 29 '19 at 15:02

2 Answers


If you have an iterable of sites that you want to map over, and you want to pass the same search_term and pages arguments to each call, you can pass several iterables to executor.map. Like the built-in map, it takes one argument from each iterable per call. The first iterable is your list of sites, and the 2nd and 3rd are the other parameters, repeated indefinitely using itertools.repeat. The mapping stops when the shortest iterable (here, sites) is exhausted.

def func(site, search_term, pages):
    ...

from itertools import repeat
executor.map(func, sites, repeat(search_term), repeat(pages))
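
For the signature from the question, func(search_id, **kwargs), one possible sketch combines functools.partial (to freeze keyword arguments shared by every call) with itertools.repeat (to supply the same search term for each site). The body of func, the search_one adapter, the search_sites helper, and the sample site names are all hypothetical stand-ins and not part of the original code.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial
from itertools import repeat


def func(search_id, **kwargs):
    # Hypothetical stand-in for the real search: report what was received.
    return [f"{search_id} @ {kwargs['sitesearch']} (pages={kwargs.get('pages', 1)})"]


def search_one(search_id, site, **fixed_kwargs):
    # Positional adapter: turns the varying site into the sitesearch keyword.
    return func(search_id, sitesearch=site, **fixed_kwargs)


def search_sites(executor, search_id, sites, **fixed_kwargs):
    # partial freezes the keyword arguments that are the same for every call;
    # repeat supplies the same search_id alongside each site.
    worker = partial(search_one, **fixed_kwargs)
    container = []
    for result in executor.map(worker, repeat(search_id), sites):
        container.extend(result)
    return container


if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        print(search_sites(executor, "tourism",
                           ["google.com", "facebook.com", "twitter.com"],
                           pages=10))
```

The adapter is defined at module level so that the partial object can be pickled and sent to the worker processes. The helper accepts any executor, so the same code should also work with a ThreadPoolExecutor.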
Iain Shelvington
  • What if sitesearch is different? I mean, can the following be done: executor.map(partial(func, pages=10), search_ids, sitesearch) – James K J Dec 29 '19 at 15:06
  • What is `sitesearch`? Is it a list of sites? How does each element relate to the elements in `search_ids`? – Iain Shelvington Dec 29 '19 at 15:12
  • I have a list of sites. I pass the search term and one site from the list to that function. A site is basically a URL, so I pass a search term and a URL from the list of URLs to that function. – James K J Dec 29 '19 at 15:13
  • Okay, I'm not exactly sure what you mean. Where is this search term coming from? – Iain Shelvington Dec 29 '19 at 15:16
  • To make it clear: let's say I want to search for the term tourism on the sites google, facebook and twitter. Previously, I was calling the function with the search term tourism and the site google.com and getting the result. Then I called the function again with the same search term for the site facebook, and again for the site twitter. – James K J Dec 29 '19 at 15:20
  • So you want to map the function with the same search_term and pages parameters but with an iterable of sites? – Iain Shelvington Dec 29 '19 at 15:22
  • Yes. Exactly. Finally you understood – James K J Dec 29 '19 at 15:23
  • @s326280 no problem, hope it helps – Iain Shelvington Dec 29 '19 at 15:35

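Here is a useful way of using kwargs in executor.map: store the keyword arguments of each call in a dict and unpack it with the **kwargs notation inside a small wrapper function. With ProcessPoolExecutor the wrapper has to be a module-level function, because a lambda cannot be pickled and sent to the worker processes (a lambda such as lambda kwargs: func(**kwargs) does work with ThreadPoolExecutor).

from concurrent.futures import ProcessPoolExecutor

def func(arg1, arg2):
    ...

def call_func(kwargs):
    # Unpack one dict of keyword arguments into a call to func
    return func(**kwargs)

items = [
    { 'arg1': 0, 'arg2': 3 },
    { 'arg1': 1, 'arg2': 4 },
    { 'arg1': 2, 'arg2': 5 }
]

if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        # list() collects the results, in the same order as items
        result = list(executor.map(call_func, items))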

I also found this useful with Pandas DataFrames, where the items can be created with to_dict by typing items = df.to_dict(orient='records'), and when loading data from JSON files.
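
As a rough illustration of the JSON case, the sketch below parses a list of records and maps each one onto a function as keyword arguments. The RAW payload, the placeholder body of func, and the call_func and run_all helpers are assumptions made up for this example. The wrapper is a named module-level function rather than a lambda, since a lambda cannot be pickled for a ProcessPoolExecutor.

```python
import json
from concurrent.futures import ProcessPoolExecutor

# Hypothetical JSON payload: one object of keyword arguments per call.
RAW = '[{"arg1": 0, "arg2": 3}, {"arg1": 1, "arg2": 4}, {"arg1": 2, "arg2": 5}]'


def func(arg1, arg2):
    # Placeholder computation standing in for the real work.
    return arg1 + arg2


def call_func(kwargs):
    # Module-level wrapper so it can be pickled for the worker processes.
    return func(**kwargs)


def run_all(executor, items):
    # executor.map yields results in the same order as items.
    return list(executor.map(call_func, items))


if __name__ == "__main__":
    items = json.loads(RAW)
    with ProcessPoolExecutor() as executor:
        print(run_all(executor, items))
```

With a DataFrame, the same run_all call should accept the list returned by df.to_dict(orient='records'), as long as the column names match the parameter names of func.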