I want to use a multiprocessing `Pool` with pandas DataFrames. I tried the following, but it raises the error below. Can't I use `Pool` with a Series?

from multiprocessing import Pool

split = np.array_split(split,4)
pool = Pool(processes=4)
df = pd.concat(pool.map(split['Test'].apply(lambda x : test(x)), split))
pool.close()
pool.join()

The error message is as follows.

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str

1 Answer

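`pool.map` takes a function as its first argument and an iterable as its second. In your code, `split` is the list returned by `np.array_split`, so `split['Test']` raises the `TypeError` before `pool.map` is even called. Pass the function itself and let each worker receive one chunk. Try: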

import pandas as pd
import numpy as np
import multiprocessing as mp

def test(x):
    return x * 2

if __name__ == '__main__':
    # Demo dataframe
    df = pd.DataFrame({'Test': range(100)})

    # Extract the Series and split it into chunks
    split = np.array_split(df['Test'], 4)

    # Parallel processing
    with mp.Pool(4) as pool:
        data = pool.map(test, split)

    # Concatenate results
    out = pd.concat(data)

Output:

>>> df
    Test
0      0
1      1
2      2
3      3
4      4
..   ...
95    95
96    96
97    97
98    98
99    99

[100 rows x 1 columns]

>>> out
0       0
1       2
2       4
3       6
4       8
     ... 
95    190
96    192
97    194
98    196
99    198
Name: Test, Length: 100, dtype: int64
Corralien
  • Thank you for your help, but it didn't work for me. – purelyawwesome Feb 19 '23 at 19:28
  • Maybe you could explain why? What is the error message? – Corralien Feb 19 '23 at 19:29
  • File "C:\Python310\lib\multiprocessing\process.py", line 315, in _bootstrap self.run() File "C:\Python310\lib\multiprocessing\process.py", line 108, in run self._target(*self._args, **self._kwargs) File "C:\Python310\lib\multiprocessing\pool.py", line 114, in worker task = get() File "C:\Python310\lib\multiprocessing\queues.py", line 368, in get return _ForkingPickler.loads(res) AttributeError: Can't get attribute 'test' on – purelyawwesome Feb 20 '23 at 00:58
  • The above error occurs both when I run your example code and when I run my own code. – purelyawwesome Feb 20 '23 at 00:59
  • Did you rename the `test` function? – Corralien Feb 20 '23 at 08:18