How should I manage a parallel optimization of an instance method using ThreadPoolExecutor?

Question

Useless intro

Hi everyone,

I'm having trouble using the nevergrad library in parallel. However, my question is about my own implementation; it seems my OOP skills are lacking. As the code is too long to reproduce here, I will describe a simplified but specific version of it.

The problem setup

First, I have a class, say Rotation, that inherits from Algorithm. Second, I created an Optimizer class that helps me optimize any algorithm:

class Optimizer:
  def __init__(self, algorithm):
    self.algorithm = algorithm

and one can define

cfg = some_dict  # configuration dict
rot_algorithm = Rotation(cfg)  # an algorithm instance
rot_optimizer = Optimizer(rot_algorithm)  # an optimizer instance that works over an algorithm instance

Now, two more things:

Inside Optimizer there is a method optimize_params that uses concurrent.futures.ThreadPoolExecutor to optimize another method named Optimizer.obj_fun (see the end of the question).
The algorithms have, in general, internal variables that modify their behavior. For example, we could imagine that rot_algorithm has an internal variable named angle that will be modified during the optimization.

The question

I found that rot_algorithm.angle is shared by all the threads: one thread can overwrite it after another thread has set its value but before that thread has finished using it. That makes the results useless. How should I refactor my code in order to avoid this behavior?
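A minimal sketch of the kind of interleaving described above, not the actual code. `Rotation` here is a hypothetical stand-in with a single `angle` attribute, `find_mismatches` is a name made up for illustration, and the `threading.Barrier` exists only to force the problematic ordering deterministically (in the real code the ordering would depend on thread scheduling).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Rotation:
    """Hypothetical stand-in for the real algorithm class."""

    def __init__(self):
        self.angle = 0.0


def find_mismatches(num_workers=2):
    algorithm = Rotation()  # one instance shared by every worker thread
    # The barrier is only here to force the bad interleaving on every run.
    barrier = threading.Barrier(num_workers, timeout=5)

    def obj_fun(angle):
        algorithm.angle = angle  # write to the shared attribute
        barrier.wait()  # every thread has written before any thread reads
        return angle, algorithm.angle  # (value this thread set, value it sees)

    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        results = list(executor.map(obj_fun, range(1, num_workers + 1)))

    # Keep only the calls that read back a different angle than they set.
    return [(set_, seen) for set_, seen in results if set_ != seen]


if __name__ == "__main__":
    print(find_mismatches())
```

With two workers, both threads read whichever value was written last, so one of the two calls evaluates with an angle it did not set.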

Use of `ThreadPoolExecutor`

        with futures.ThreadPoolExecutor(
            max_workers=optimizer.num_workers
        ) as executor:
            r = optimizer.minimize(
                self.obj_fun,
                executor=executor,
                batch_mode=False,
            )

You need to figure out a way to present the section of code that you think is the cause of the problem. — Paul Cornelius, Jul 06 '20 at 22:36
You are aware of [thread local data](https://docs.python.org/3/library/threading.html#thread-local-data)? — wwii, Jul 06 '20 at 22:47
@PaulCornelius I know, that was the best I could do today. If I manage to reproduce the problem I will update the question immediately. — Franco Marchesoni, Jul 06 '20 at 23:17
@wwii I was not aware, seems useful, thank you. However, the multi-threading is not implemented by me but by nevergrad. I tried to pose the question as a conceptual one where the code inside the `with` is fixed. — Franco Marchesoni, Jul 06 '20 at 23:23
Yes, they are. But how would you do it? If I define `mydata = threading.local()` then I should do `mydata.local_obj = self.copy()` or something like that? — Franco Marchesoni, Jul 07 '20 at 22:40
Can you change your code so you don't modify rot_algorithm.angle, perhaps by introducing a new variable that is local to the calculation? That would seem to be an easy thing to try compared to these other solutions, which ALSO require you to create a new variable that is modifiable so that you can leave rot_algorithm.angle alone. You can't fix this problem as described without changing the way you use rot_algorithm.angle. — Paul Cornelius, Jul 07 '20 at 23:49
Thank you all, I ended up using `threading.local` as @wwii suggested, along the lines of my follow-up comment. — Franco Marchesoni, Jul 08 '20 at 19:48
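The `threading.local` approach mentioned in the comments could look roughly like the sketch below. This is an illustration under assumptions, not the code that was actually used: `Rotation` is a hypothetical stand-in, `_thread_algorithm` is a helper name invented here, the squared angle is a placeholder loss, and the sketch assumes the algorithm object can be deep-copied.

```python
import copy
import threading
from concurrent.futures import ThreadPoolExecutor


class Rotation:
    """Hypothetical stand-in for the real algorithm class."""

    def __init__(self):
        self.angle = 0.0


class Optimizer:
    def __init__(self, algorithm):
        self.algorithm = algorithm  # template instance, never mutated below
        self._local = threading.local()  # attribute storage private to each thread

    def _thread_algorithm(self):
        # Lazily create one private copy of the algorithm per worker thread.
        if not hasattr(self._local, "algorithm"):
            self._local.algorithm = copy.deepcopy(self.algorithm)
        return self._local.algorithm

    def obj_fun(self, angle):
        algorithm = self._thread_algorithm()
        algorithm.angle = angle  # only this thread's copy changes
        return algorithm.angle ** 2  # placeholder loss, for illustration only


if __name__ == "__main__":
    optimizer = Optimizer(Rotation())
    with ThreadPoolExecutor(max_workers=4) as executor:
        print(list(executor.map(optimizer.obj_fun, [1.0, 2.0, 3.0])))
```

Because each worker thread mutates only its own copy, the executor passed to the optimization library can stay unchanged. The alternative raised in the comments, passing the angle as a local argument instead of storing it on the instance, avoids the copies entirely when the algorithm's interface allows it.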
