I am using a Flask server to handle requests for some image-processing tasks.

The processing relies extensively on OpenCV, and I would now like to trivially parallelize some of the slower steps.

I have a preference for multiprocessing rather than multithreading (please assume the former in your answers).

However, multiprocessing with OpenCV is apparently broken (I am on Python 2.7 + macOS): https://github.com/opencv/opencv/issues/5150

One solution (see https://github.com/opencv/opencv/issues/5150#issuecomment-400727184) is to use the excellent Loky (https://github.com/tomMoral/loky).

[Question: What other working solutions exist apart from concurrent.futures, loky, joblib, etc.?]

But Loky leads me to the following stacktrace:

    a,b = f.result()
  File "/anaconda2/lib/python2.7/site-packages/loky/_base.py", line 433, in result
    return self.__get_result()
  File "/anaconda2/lib/python2.7/site-packages/loky/_base.py", line 381, in __get_result
    raise self._exception
BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

This was caused directly by
'''
Traceback (most recent call last):
  File "/anaconda2/lib/python2.7/site-packages/loky/process_executor.py", line 391, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
  File "/anaconda2/lib/python2.7/multiprocessing/queues.py", line 135, in get
    res = self._recv()
  File "myfile.py", line 44, in <module>
    app.config['EXECUTOR_MAX_WORKERS'] = 5
  File "/anaconda2/lib/python2.7/site-packages/werkzeug/local.py", line 348, in __getattr__
    return getattr(self._get_current_object(), name)
  File "/anaconda2/lib/python2.7/site-packages/werkzeug/local.py", line 307, in _get_current_object
    return self.__local()
  File "/anaconda2/lib/python2.7/site-packages/flask/globals.py", line 52, in _find_app
    raise RuntimeError(_app_ctx_err_msg)
RuntimeError: Working outside of application context.

This typically means that you attempted to use functionality that needed
to interface with the current application object in some way. To solve
this, set up an application context with app.app_context().  See the
documentation for more information.
'''
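
One reading of the traceback above (not verified) is that the worker re-imports myfile.py while un-pickling the task, and that the module-level line 44 then touches app.config through a context-bound proxy when no application context exists. If that is right, one possible direction is to read the needed settings while a context is still active and hand the workers only plain, picklable data. The sketch below is an assumption-laden illustration: snapshot_settings, survives_pickle and fake_config are hypothetical names, the nested list is placeholder image data, and nothing here has been tested with Loky, flask-executor or OpenCV.

```python
import pickle


def snapshot_settings(config, keys):
    """Copy only the named keys into a plain dict (hypothetical helper)."""
    return {key: config[key] for key in keys}


def survives_pickle(obj):
    """Return True if obj round-trips through pickle unchanged.

    Task arguments sent to a process pool are pickled in the parent and
    un-pickled in the worker, so this is a cheap check to run beforehand.
    """
    return pickle.loads(pickle.dumps(obj, protocol=2)) == obj


# Stand-in for app.config, read while an application context is active.
fake_config = {"EXECUTOR_MAX_WORKERS": 5, "DEBUG": False}

# Only plain values cross the process boundary, never the app or a proxy.
settings = snapshot_settings(fake_config, ["EXECUTOR_MAX_WORKERS"])

# Placeholder image data as nested lists; real image arrays are assumed
# to be picklable as well, but that is not checked here.
task_args = (settings, [[0, 1], [2, 3]])

assert survives_pickle(task_args)
```

This only checks the arguments. If the module that defines the worker function still runs Flask-dependent code at import time, the worker would presumably fail in the same way, so that function would also need to live in a module that does not touch the app on import.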

The functions to be parallelized are not called from app/main.py, but rather from an arbitrarily deep submodule.

I have also tried the similarly useful-looking https://flask-executor.readthedocs.io/en/latest, but so far in vain.

So the question is:

How can I safely pass the application context through to the workers or otherwise get multiprocessing working (without recourse to multithreading)?

I can build out this question if you need more information. Many thanks as ever.

Related resources:

Copy flask request/app context to another process

Flask Multiprocessing

Update:

Non-OpenCV calls work fine with flask-executor (no Loky) :)

The problem comes when trying to call an OpenCV function like knnMatch.

If Loky fixes the OpenCV issue, I wonder whether it can be made to work with flask-executor (I have not managed it so far).

jtlz2