Code looks something like this:
import multiprocessing as mp
from functools import partial

import boto3
import numpy as np

s3 = boto3.client('s3')

def _something(**kwargs):
    # Some mixed integer programming stuff related to the variable archive
    return np.array(some_variable_related_to_archive)

def do(s3):
    archive = np.load(s3.get_object('some_key'))  # Simplified -- details not relevant
    pool = mp.Pool()
    sub_process = partial(_something, slack=0.1)
    parts = np.array_split(archive, some_int)
    target_parts = np.array(things)
    out = pool.starmap(sub_process, [x for x in zip(parts, target_parts)])  # Error occurs at this line
    pool.close()
    pool.join()

do(s3)
The error:
_pickle.PicklingError: Can't pickle <class 'botocore.client.S3'>: attribute lookup S3 on botocore.client failed
I have very limited experience with the Python multiprocessing library. I'm not sure why it throws the above error when the S3 client is not an argument to any of the functions run in the pool. Note that the code runs fine if the archive file is loaded from disk rather than from S3.
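Since the pool has to pickle every argument tuple before sending it to a worker, one way to narrow this down might be to try pickling each argument by hand and see which one fails. Below is a minimal sketch of that check. The helper name `find_unpicklable` and the `threading.Lock` stand-in are hypothetical and only for illustration; they are not part of my actual code, and the lock merely plays the role of some object that cannot be pickled.

```python
import pickle
import threading


def find_unpicklable(args):
    """Return (index, exception) pairs for the arguments pickle cannot serialize."""
    failures = []
    for index, arg in enumerate(args):
        try:
            pickle.dumps(arg)
        except Exception as exc:  # the exact exception type varies by object
            failures.append((index, exc))
    return failures


# Hypothetical task list: each tuple is what starmap would send to one worker.
# The lock stands in for any object that holds an unpicklable resource.
tasks = [
    ([1, 2, 3], 0.5),
    (threading.Lock(), 0.5),
]

for task_number, task in enumerate(tasks):
    for index, exc in find_unpicklable(task):
        print(f"task {task_number}, argument {index}: {exc!r}")
```

In my case I would run the same check over the tuples produced by `zip(parts, target_parts)`, and over `sub_process` itself, to see which object is dragging the S3 client along.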
Any help/guidance would be greatly appreciated.