I'm trying to do some multiprocessing in Python and have a problem with data types changing when I use Pool.starmap. See the snippet below:

import multiprocessing as mp
from itertools import repeat

# some_function, array_1, array_2 and df are placeholders for my real function and data
with mp.Pool(3) as pool:
    pool.starmap(some_function, zip([array_1, array_2], repeat(df)))

Before passing the np.array to the function, it looks something like:

['some string', some int, some float, 'some string']

But after passing it, it somehow gets converted to:

['some string', 'some string', 'some string', 'some string']

Has anyone experienced similar problems so far? Cheers

  • What is `some function`? That seems like a likely culprit. – Carcigenicate Apr 07 '21 at 13:56
  • The function just does some math with the array and the df. Works totally fine when calling it directly. – JoeTo1311 Apr 07 '21 at 13:59
  • Numpy arrays are homogeneous; the numbers are converted to strings before any multiprocessing happens. `np.array(['a',1,1.2])` --> `array(['a', '1', '1.2'], dtype='<U…')` – wwii Apr 07 '21 at 18:47
  • Does [Python - strings and integers in Numpy array](https://stackoverflow.com/questions/44831502/python-strings-and-integers-in-numpy-array) answer your question? – wwii Apr 07 '21 at 18:58
  • Thanks, you helped me a lot. I found a way to use dtype=int for the array instead of object, and now it works! – JoeTo1311 Apr 09 '21 at 08:48
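
To illustrate the point made in the comments, here is a minimal sketch of where the conversion probably happens. It assumes the array was built from a mixed list without an explicit dtype; `describe_row` and its `scale` argument are hypothetical stand-ins for the real function and the DataFrame, which the question does not show.

```python
import multiprocessing as mp
from itertools import repeat

import numpy as np


def describe_row(row, scale):
    """Hypothetical worker: report each element's type and scale the numbers."""
    types = [type(x).__name__ for x in row]
    scaled = [x * scale if isinstance(x, (int, float)) else x for x in row]
    return types, scaled


# Without an explicit dtype, NumPy picks one common dtype for all elements,
# so the int and the float are stored as strings at creation time.
coerced = np.array(["a", 1, 1.2])

# dtype=object keeps the original Python objects (str, int, float).
mixed = np.array(["a", 1, 1.2, "b"], dtype=object)

if __name__ == "__main__":
    print(coerced.dtype.kind)  # 'U', i.e. a unicode string dtype
    with mp.Pool(2) as pool:
        # The object array is pickled and sent to the workers unchanged.
        results = pool.starmap(describe_row, zip([mixed, mixed], repeat(2)))
    print(results[0])
```

In this sketch the strings are already present in `coerced` before any pool is created, so `Pool.starmap` is not what changes the types. If every value is numeric, a numeric dtype such as `int` is usually a better choice than `object`.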
