I have a list of black-box functions (no critical data handling, no side-effects from calling, okay to terminate prematurely) which need to be executed and their return values aggregated:
aggregate = [result for fn in seq for result in fn()]
The functions generally return within a few microseconds; however, there are edge cases beyond my control that occasionally cause some of them to raise an exception, return an invalid type (e.g. None instead of an iterable), or hang indefinitely.
To handle invalid output and exceptions, I wrote a wrapper:
def safe_call(fn, valid_type, default, error_handler=None):
    try:
        val = fn()
        # Only accept results of the expected type.
        if isinstance(val, valid_type):
            return val
        if error_handler is not None:
            error_handler(TypeError("Invalid return type"))
    except Exception as e:
        # Swallow the exception, optionally reporting it.
        if error_handler is not None:
            error_handler(e)
    return default
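For context, this is how it plugs into the aggregation above (a minimal sketch; list as the valid_type and [] as the default are assumptions, substitute whatever your functions are actually supposed to return):

aggregate = [
    result
    for fn in seq
    for result in safe_call(fn, list, [])
]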
Now the only remaining problem is hangs. The entire expression above should always evaluate within milliseconds, so I'm trying to enforce a 1 ms timeout on each function in the sequence.
I tried threading and multiprocessing: problematic threads cannot be terminated cleanly, and the overhead of starting a new process is so high that it exceeds the 1 ms timeout by itself.
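For reference, the threading attempt looked roughly like this (a minimal sketch with hypothetical names; the point is that a timeout on result() only stops the waiting, not the hung thread, and each hung call permanently occupies a pool worker):

from concurrent.futures import ThreadPoolExecutor, TimeoutError

executor = ThreadPoolExecutor()

def call_with_timeout(fn, timeout=0.001, default=None):
    future = executor.submit(fn)
    try:
        # Waits at most `timeout` seconds for the result.
        return future.result(timeout=timeout)
    except TimeoutError:
        # cancel() is a no-op once the callable has started; the
        # hung thread keeps running and cannot be killed from here.
        future.cancel()
        return default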
I also tried signals; a benchmark showed about 25% overhead per call, which would be acceptable, but the evaluation happens inside a Django view, which does not run in the main thread, so signal handlers cannot be installed there.
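The signal-based version, for completeness (a sketch, Unix-only, with hypothetical names; it aborts the call by raising from an interval-timer handler, which only works in the main thread):

import signal

class CallTimeout(Exception):
    pass

def _raise_timeout(signum, frame):
    raise CallTimeout

def call_with_timeout(fn, timeout=0.001, default=None):
    # signal.signal() raises ValueError outside the main thread,
    # which is exactly what breaks under the Django view.
    old_handler = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.setitimer(signal.ITIMER_REAL, timeout)
    try:
        return fn()
    except CallTimeout:
        return default
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)   # cancel a pending alarm
        signal.signal(signal.SIGALRM, old_handler)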
My current workaround is to keep a separate long-lived process for each function, synchronized with the parent through a Queue and waiting to be called. The parent triggers a call on demand, waits 1 ms, then checks whether a result has been sent back, saving it if so or terminating the process otherwise.
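The workaround looks roughly like this (a sketch of the pattern described above with hypothetical names; it assumes each fn is picklable and omits respawning a worker after termination):

import multiprocessing as mp
from queue import Empty

def _serve(fn, requests, results):
    # Block until the parent requests a call, then send the result back.
    while True:
        requests.get()
        results.put(fn())

class PersistentWorker:
    def __init__(self, fn):
        self.requests = mp.Queue()
        self.results = mp.Queue()
        self.process = mp.Process(
            target=_serve, args=(fn, self.requests, self.results), daemon=True
        )
        self.process.start()

    def call(self, timeout=0.001, default=None):
        self.requests.put(None)  # trigger one call in the worker
        try:
            return self.results.get(timeout=timeout)
        except Empty:
            # The worker is presumed hung; kill it. A real version
            # would respawn it here so the function stays callable.
            self.process.terminate()
            return default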
This works as far as performance is concerned, but the solution carries a 1400-1700% memory overhead, which is not acceptable.
So how do I time out a function call with sub-millisecond precision without a huge memory overhead?