
I have the following code:


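import multiprocessing as mp

import awswrangler as wr
import pandas as pd

# Results collected by the callback in the parent process
total_pds = []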

ksplit = wr.s3.list_objects(pred_path)
ksplit = list(ksplit)

def process(x):
    dk = wr.s3.read_parquet(path = pred_path+x,dataset=False)
    return dk

def log_result(result):
    print(len(total_pds), end = ' ')
    total_pds.append(result)

def error_back(error):
    print('error', error)

pool = mp.Pool(processes=4,maxtasksperchild=10)
dcms_info = [pool.apply_async(process, args=(spl,), callback = log_result, error_callback = error_back) for spl in ksplit] 

for x in dcms_info:
    x.wait()

pool.close()
pool.join()

dataset = pd.concat(total_pds, ignore_index=True)

The last element throws this error:

error("'i' format requires -2147483648 <= number <= 2147483647"

Thank you

FrankTan
  • Looks to me like it should work. Try adding an `error_callback` to see if any of the `process` calls raised an exception. – Thomas Mar 03 '21 at 08:47
  • @Thomas Sorry, you are right. The error is: error("'i' format requires -2147483648 <= number <= 2147483647". What does that mean? – FrankTan Mar 03 '21 at 09:51
  • Does this answer your question? [Exception thrown in multiprocessing Pool not detected](https://stackoverflow.com/questions/6728236/exception-thrown-in-multiprocessing-pool-not-detected) – Aaron Mar 03 '21 at 15:17
  • @Aaron Actually, I don't think that is my problem. The error is exactly this: error("'i' format requires -2147483648 <= number <= 2147483647". It seems connected to a pandas number? I don't know. – FrankTan Mar 04 '21 at 10:23
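
One possible reading of the message, offered as an assumption and not a confirmed diagnosis: the text matches what Python's `struct` module raises when a value does not fit in a signed 32-bit `i` field. Before Python 3.8, `multiprocessing` prefixed each pickled message sent between processes with such a field holding the payload length, so a single result of 2 GiB or more could fail in this way when the worker sends it back. The sketch below only reproduces the `struct` behaviour; `pack_length` and `fits_in_header` are hypothetical helper names, not part of `multiprocessing`, and it does not confirm that the last parquet file is that large.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit "i" field can hold


def pack_length(n_bytes):
    """Pack a payload size as a 4-byte signed big-endian integer."""
    return struct.pack("!i", n_bytes)


def fits_in_header(n_bytes):
    """Return True if n_bytes can be packed by pack_length without error."""
    try:
        pack_length(n_bytes)
        return True
    except struct.error as exc:
        # The exact wording of the message can vary by platform and version.
        print("error", exc)
        return False


print(fits_in_header(INT32_MAX))      # one byte under 2 GiB
print(fits_in_header(INT32_MAX + 1))  # exactly 2 GiB
```

If this is the cause, returning smaller pieces from `process` (or reading the largest file in the parent process) would be one thing to try; that suggestion is likewise an assumption.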
