I'm trying to use multiprocessing for a function that can potentially segfault (I have no control over this at the moment). In cases where a child process hits a segfault, I want only that child to fail, while all other child tasks continue and return their results.
I've already switched from multiprocessing.Pool to concurrent.futures.ProcessPoolExecutor to avoid the issue of the child process hanging forever (or until an arbitrary timeout), as documented in this bug: https://bugs.python.org/issue22393.
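For context, this is roughly the Pool-based version I moved away from (a minimal sketch, assuming the same do_something as in the full example below); when a worker segfaults, map() just blocks instead of raising:

import multiprocessing

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # If any worker dies from a segfault, map() hangs forever
        # instead of raising (https://bugs.python.org/issue22393).
        # do_something is the same function as in the full example below.
        results = pool.map(do_something, [1, 2, 3, 1.5])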
However, the issue I face now is that when the first child task hits a segfault, all in-flight child processes get marked as broken (concurrent.futures.process.BrokenProcessPool).
Is there a way to only mark actually broken child processes as broken?
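The only workaround I can see is to give each task its own throwaway single-worker pool, so a crash only poisons that one pool, but that defeats the purpose of pooling. A minimal sketch of that idea (run_isolated is a hypothetical helper name of my own):

import concurrent.futures

def run_isolated(func, arg):
    # Hypothetical helper: one disposable single-worker pool per task.
    # A segfault then breaks only this pool, leaving other tasks
    # unaffected, but every call pays the cost of a fresh process.
    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
        return executor.submit(func, arg).result()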
Code I'm running in Python 3.7.4:
import concurrent.futures
import ctypes
from time import sleep


def do_something(x):
    print(f"{x}; in do_something")
    sleep(x * 3)
    if x == 2:
        # raise a segmentation fault internally
        return x, ctypes.string_at(0)
    return x, x - 1


nums = [1, 2, 3, 1.5]
executor = concurrent.futures.ProcessPoolExecutor()
result_futures = []
for num in nums:
    # Using submit with a list instead of map lets you get past the first exception
    # Example: https://stackoverflow.com/a/53346191/7619676
    future = executor.submit(do_something, num)
    result_futures.append(future)

# Wait for all results
concurrent.futures.wait(result_futures)

# After a segfault is hit for any child process (i.e. it is "terminated abruptly"),
# the process pool becomes unusable and all running/pending child processes'
# results are set to broken
for future in result_futures:
    try:
        print(future.result())
    except concurrent.futures.process.BrokenProcessPool:
        print("broken")
Result:
(1, 0)
broken
broken
(1.5, 0.5)
Desired result:
(1, 0)
broken
(3, 2)
(1.5, 0.5)