
So I had the following code which works fine:

from concurrent.futures import ProcessPoolExecutor
import itertools
import numpy as np

def grid_search_helper(vec_input):
    v1 = vec_input[0]
    v2 = vec_input[1]
    v3 = vec_input[2]
    d = {'v1' : v1, 'v2' : v2, 'v3' : v3}
    return(d)

idx = range(0,10)
cutoff = np.ndarray.tolist(np.arange(0.6,0.95,0.05))
opt = [2]

iters = itertools.product(idx, cutoff, opt)

with ProcessPoolExecutor(max_workers=11) as executor:
    for res in executor.map(grid_search_helper, iters):
        print(res)

Then I tried using zip() to print each input alongside the result that ProcessPoolExecutor produces for it. However, nothing is printed when I run the following code:

from concurrent.futures import ProcessPoolExecutor
import itertools
import numpy as np

def grid_search_helper(vec_input):
    v1 = vec_input[0]
    v2 = vec_input[1]
    v3 = vec_input[2]
    d = {'v1' : v1, 'v2' : v2, 'v3' : v3}
    return(d)

idx = range(0,10)
cutoff = np.ndarray.tolist(np.arange(0.6,0.95,0.05))
opt = [2]

iters = itertools.product(idx, cutoff, opt)

with ProcessPoolExecutor(max_workers = 11) as executor:
        for  res, itr in zip(executor.map(grid_search_helper,iters), iters):
            print(res, itr)

I cannot figure out why. Can anybody help?

ABIM
  • 1
    You can't iterate over `iters` (i.e. `itertools.product()`) twice. Either turn the iterator into a list, or re-create the iterator after `executor.map(grid_search_helper,iters)`. – Aran-Fey Aug 23 '18 at 17:31
  • 1
    You can duplicate your iterator with `itertools.tee`: `iter_a, iter_b = tee(iters)`. – Daniel Aug 23 '18 at 17:31
  • @Daniel: Can you write this as an answer so I can accept it? – ABIM Aug 23 '18 at 17:34

2 Answers


It has nothing to do with the fact that you're zipping a function and an iterator.

The problem is you're using the same iterator twice:

#                                                      v       v
for res, itr in zip(executor.map(grid_search_helper, iters), iters):
    ...

executor.map reads every item from iters as soon as it is called, so the iterator is exhausted at that point. When zip later asks the same iterator for an item, there is nothing left, so zip stops immediately and the loop body never runs.
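The same effect can be reproduced without any executor. The snippet below is a minimal sketch that uses a small made-up product (the names pairs, first_pass and second_pass are only for illustration) to show that a second pass over the same iterator yields nothing:

```python
import itertools

# A small stand-in for the product used in the question.
pairs = itertools.product(range(2), "ab")

first_pass = list(pairs)   # consumes every item
second_pass = list(pairs)  # the same iterator object has nothing left

print(first_pass)
print(second_pass)

# zip stops as soon as any of its inputs is exhausted,
# so pairing with the spent iterator produces no tuples at all.
zipped = list(zip(first_pass, pairs))
print(zipped)
```

The first list holds all four tuples, while second_pass and zipped are both empty.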

Use itertools.tee to create two independent iterators over the same items, and do not use the original iterator after that.

it1, it2 = itertools.tee(itertools.product(idx, cutoff, opt))

with ProcessPoolExecutor(max_workers=11) as executor:
    for res, itr in zip(executor.map(grid_search_helper, it1), it2):
        print(res, itr)
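As suggested in the comments on the question, another option is to turn the product into a list once and reuse it. The sketch below assumes a smaller grid than the one in the question and writes the cutoff values out by hand instead of using NumPy. The helper run_grid is a hypothetical name introduced only for this example. As far as I know, executor.map reads its whole input up front by default, so tee has to buffer every item anyway and a list should cost about the same memory.

```python
from concurrent.futures import ProcessPoolExecutor
import itertools


def grid_search_helper(vec_input):
    v1, v2, v3 = vec_input
    return {'v1': v1, 'v2': v2, 'v3': v3}


def run_grid(executor, params):
    # Hypothetical helper: params is a list, so map() and zip()
    # can each iterate over it without exhausting it for the other.
    return list(zip(executor.map(grid_search_helper, params), params))


if __name__ == "__main__":
    idx = range(0, 3)
    cutoff = [0.6, 0.65, 0.7]  # assumption: hand-written stand-in values
    opt = [2]
    params = list(itertools.product(idx, cutoff, opt))  # materialise once

    with ProcessPoolExecutor(max_workers=2) as executor:
        for res, itr in run_grid(executor, params):
            print(res, itr)
```

executor.map returns results in input order, so each res lines up with the itr that produced it.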
cs95

By the time zip(..., iters) runs, iters is already exhausted, because executor.map(grid_search_helper, iters) has consumed all of its items.

So you're actually passing an empty iterator to zip(), and the loop has nothing to iterate over.

vhcandido