
I have a shared variable called temp that is supposed to be read and updated continuously by multiple instances of a function called do(). In particular, using the ray module, I decorate do() as shown below.

@ray.remote
def do(temp):
    prob1, prob2 = compute_probability(tau1, tau2)
    selected_path = select_path(prob1, prob2)
    if selected_path == 1:
        temp += 1
    update_accumulation(selected_path)

Then, in my main loop, I just call all instances of do() like:

temp_id = ray.put(temp)
ray.get([do.remote(temp_id) for _ in range(N)])

However, in the full working code below, the value of temp is always 0 at the end of each iteration of the loop. Can anyone point out where my mistake is?

import random
import matplotlib.pyplot as plt
import ray

N = 500
l1 = 1
l2 = 2
ru = 0.5
Q = 1
tau1 = 0.5
tau2 = 0.5

epochs = 150

success = [0 for x in range(epochs)]

def compute_probability(tau1, tau2):
    return tau1/(tau1 + tau2), tau2/(tau1 + tau2)

def select_path(prob1, prob2):
    return random.choices([1,2], weights=[prob1, prob2])[0]

def update_accumulation(link_id):
    global tau1
    global tau2
    if link_id == 1:
        tau1 += Q / l1
        return tau1
    if link_id == 2:
        tau2 += Q / l2
        return tau2

def update_evaporation():
    global tau1
    global tau2
    tau1 *= (1-ru)
    tau2 *= (1-ru)
    return tau1, tau2

def report_results(success):
    plt.plot(success)
    plt.show()

ray.init()

@ray.remote
def do(temp):
    prob1, prob2 = compute_probability(tau1, tau2)
    selected_path = select_path(prob1, prob2)
    if selected_path == 1:
        temp += 1
    update_accumulation(selected_path)

for epoch in range(epochs-1):
    temp = 0
    temp_id = ray.put(temp)
    ray.get([do.remote(temp_id) for _ in range(N)])
    update_evaporation()
    success[epoch] = temp

report_results(success) 
User
  • you may need to use `ray.get(temp_id)` to get the value, but I don't know if you can put the value back in the same place, i.e. `ray.put(ray.get(temp_id) + 1)`. The other problem is that you create `temp` and `ray.put(temp)` inside the loop, so every epoch will use its own `temp`, which will not be shared with the other epochs – furas Dec 28 '20 at 01:28
  • I found [How to use global variables with Ray](https://stackoverflow.com/questions/62640533/how-to-use-global-variables-with-ray) but I didn't test it. – furas Dec 28 '20 at 02:18
  • @furas: Thanks for sharing your thoughts and the link. Indeed, each `epoch` has its own independent `temp`, and the various `temp`s are not supposed to communicate. However, each `do()` of an epoch seemingly works with its own `temp`, while that `temp` should be shared between all `do()`s of that epoch. – User Dec 28 '20 at 14:19
  • I don't have experience with `Ray` - I tried to test the code from the link but it doesn't work as I expected. With `threading` it would be simpler because threads share the same memory. With `multiprocessing` you could use [shared memory](https://docs.python.org/3/library/multiprocessing.shared_memory.html) – furas Dec 28 '20 at 20:44

1 Answer


Ray relies on a patched version of cloudpickle to pickle functions. This means that global variables are serialized along with the function, and are thus not shared between calls to the function.
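As a minimal sketch of what this means (the counter variable and bump() function here are made up for illustration), a global mutated inside a remote task only changes the copy that lives in the worker process; the driver's copy is never touched:

import ray

ray.init()

counter = 0  # lives in the driver process

@ray.remote
def bump():
    # The worker gets its own serialized copy of the globals,
    # so this increment never reaches the driver's `counter`.
    global counter
    counter += 1
    return counter

# The returned values depend on how the tasks are spread across workers,
# but the driver's counter is still 0 afterwards.
print(ray.get([bump.remote() for _ in range(3)]))
print(counter)  # 0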

The recommended way of dealing with this issue is with an Actor. See this post for more details.
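As a rough sketch of that approach (the class name PheromoneState and its methods are invented for illustration, not part of the original answer), the question's loop body can be moved into an Actor so that temp, tau1, and tau2 live in a single process and every call sees the same values. Note that method calls on one actor execute serially, so this trades the (ineffective) parallelism of the original code for correct shared state:

import random
import ray

ray.init()

@ray.remote
class PheromoneState:
    # All shared state lives inside this single actor process.
    def __init__(self, tau1=0.5, tau2=0.5, Q=1, l1=1, l2=2):
        self.tau1, self.tau2 = tau1, tau2
        self.Q, self.l1, self.l2 = Q, l1, l2
        self.temp = 0

    def do(self):
        # Same logic as the remote function in the question,
        # but updating the actor's own attributes.
        prob1 = self.tau1 / (self.tau1 + self.tau2)
        prob2 = self.tau2 / (self.tau1 + self.tau2)
        selected_path = random.choices([1, 2], weights=[prob1, prob2])[0]
        if selected_path == 1:
            self.temp += 1
            self.tau1 += self.Q / self.l1
        else:
            self.tau2 += self.Q / self.l2

    def evaporate(self, ru=0.5):
        self.tau1 *= (1 - ru)
        self.tau2 *= (1 - ru)

    def pop_temp(self):
        # Return the current count and reset it for the next epoch.
        value, self.temp = self.temp, 0
        return value

state = PheromoneState.remote()
success = []
for epoch in range(150):
    ray.get([state.do.remote() for _ in range(500)])
    state.evaporate.remote()
    success.append(ray.get(state.pop_temp.remote()))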

Alex