
My goal is to make the code below run in roughly 0.3 seconds instead of 0.5. I've tried the decorators functools.lru_cache, toolz.functoolz.memoize and kids.cache.cache on foo, but none of them worked: each either raised an error or did not execute correctly. What can I do to make this work?

import time

import ray


@ray.remote
class Foo:
    def foo(self, x):
        print("executing foo with {}".format(x))
        time.sleep(0.1)


ray.init()
f = Foo.remote()
s = time.time()
ray.get([f.foo.remote(x=i) for i in [1, 2, 1, 4, 1]])
print(time.time()-s)
ray.shutdown()
Daniel

1 Answer


General warning: Caching arbitrary function calls can be dangerous if the function produces side effects.
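To make the warning concrete, here is a minimal sketch in plain Python (no Ray involved) using functools.lru_cache from the standard library. The names square_and_log and call_log are made up for illustration; the list append stands in for any side effect, such as a print or a file write.

```python
import functools

# Hypothetical stand-in for a side effect (printing, writing a file, ...).
call_log = []


@functools.lru_cache(maxsize=None)
def square_and_log(x):
    # The append is the side effect; the return value is the cacheable part.
    call_log.append(x)
    return x * x


results = [square_and_log(x) for x in [1, 2, 1, 4, 1]]
print(results)   # [1, 4, 1, 16, 1]
print(call_log)  # [1, 2, 4]
```

The return values are correct for all five calls, but the side effect happens only once per distinct argument. That is exactly what is wanted here, but it would be a bug if every call were expected to leave a trace.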

In this case, presumably you want the program to output

executing foo with 1 
executing foo with 2 
executing foo with 4 

The caching tools you mentioned tend not to work well with Ray because they keep the cache in some kind of global state, and that state does not live anywhere that can be accessed in a distributed way. Since you already have an actor, you can simply store the state in the actor.

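import time

import ray


@ray.remote
class Foo:
    def __init__(self):
        # The cache lives inside the actor, so every call to this actor sees it.
        self.foo_cache = {}

    def foo(self, x):
        def real_foo(x):
            print("executing foo with {}".format(x))
            time.sleep(0.1)

        # Run the expensive body only the first time each argument is seen.
        if x not in self.foo_cache:
            self.foo_cache[x] = real_foo(x)
        return self.foo_cache[x]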

This is a pretty generic caching technique; the only important difference is that the state has to be stored in the actor.
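If the number of distinct arguments can grow without bound, so will the dictionary. Below is a rough sketch of a size-bounded variant built on collections.OrderedDict from the standard library. It is written as a plain class so it runs without a cluster; the assumption is that adding @ray.remote on top would turn it into an actor as before. The names BoundedFoo, max_entries and executions are hypothetical, and real_foo returns its argument here only so there is something to check.

```python
import time
from collections import OrderedDict


class BoundedFoo:
    def __init__(self, max_entries=128):
        self.max_entries = max_entries
        self.foo_cache = OrderedDict()  # oldest entry first, newest last
        self.executions = 0             # counts real (uncached) executions

    def _real_foo(self, x):
        self.executions += 1
        time.sleep(0.01)  # stands in for the expensive work
        return x

    def foo(self, x):
        if x in self.foo_cache:
            # Cache hit: mark the entry as most recently used.
            self.foo_cache.move_to_end(x)
            return self.foo_cache[x]
        result = self._real_foo(x)
        self.foo_cache[x] = result
        if len(self.foo_cache) > self.max_entries:
            # Evict the least recently used entry.
            self.foo_cache.popitem(last=False)
        return result


f = BoundedFoo(max_entries=2)
outputs = [f.foo(x) for x in [1, 2, 1, 4, 1]]
print(f.executions)  # 3
```

With the argument sequence from the question, the expensive body still runs only three times, even though the cache never holds more than two entries.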

More generalized approach

We can also generalize this approach to any Ray function by defining a general-purpose function cache:

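@ray.remote
class FunctionCache:
    def __init__(self, func):
        self.func = ray.remote(func)
        self.cache = {}

    def call(self, *args, **kwargs):
        # kwargs is a dict and therefore unhashable, so build a hashable key.
        # The arguments themselves must also be hashable.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in self.cache:
            # A remote function cannot be called directly; submit it and
            # wait for the result so the cache stores a plain value.
            self.cache[key] = ray.get(self.func.remote(*args, **kwargs))
        return self.cache[key]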
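One detail worth spelling out is the cache key. A tuple that contains kwargs cannot be used as a dictionary key, because kwargs is a dict and dicts are unhashable. A minimal sketch of one way to build a usable key follows; make_cache_key is a hypothetical helper, and it assumes every argument value is itself hashable.

```python
def make_cache_key(args, kwargs):
    # Sort keyword arguments by name so that f(a=1, b=2) and f(b=2, a=1)
    # map to the same key.
    return (tuple(args), tuple(sorted(kwargs.items())))


cache = {}
cache[make_cache_key((1,), {"a": 1, "b": 2})] = "result"

# The same call with keyword arguments in a different order hits the cache.
same = make_cache_key((1,), {"b": 2, "a": 1}) in cache

# A tuple holding a raw dict cannot be hashed at all.
try:
    hash(((1,), {"a": 1}))
    raw_tuple_hashable = True
except TypeError:
    raw_tuple_hashable = False

print(same, raw_tuple_hashable)  # True False
```

Unhashable arguments such as lists or NumPy arrays would need their own conversion, for example to tuples or bytes, before they could be part of a key.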

Then to clean up the way we use it we can define a decorator:

class RemoteFunctionLookAlike:
    def __init__(self, func):
        self.func = func

    def remote(self, *args, **kwargs):
        return self.func(*args, **kwargs)


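def ray_cache(func):
    # One cache actor per decorated function, created at decoration time.
    cache = FunctionCache.remote(func)

    def call_with_cache(*args, **kwargs):
        return cache.call.remote(*args, **kwargs)

    # Wrap the closure so callers can keep writing foo.remote(...).
    return RemoteFunctionLookAlike(call_with_cache)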

Finally, to use this cache:

@ray_cache
def foo(x):
    print("Function called!")
    return "abc"

ray.get([foo.remote("constant") for _ in range(100)]) # Only prints "Function called!" once.
Alex
    Thank you very much! I'm using the first, simple solution for now. Instead of a dictionary I'm using `pylru.lrucache`, it's working really well. – Daniel Aug 13 '20 at 09:22