
I'm using rpy2 to wrap R libraries (modules, in Python speak) from Python, through the importr function that rpy2 provides.

The issue is that importr can be very expensive at runtime (it does a number of things when invoked), so I'd like it to be called just once per library. Several functions use the result of each importr call, but I can't just put every call at the top of the module, or it would slow import time significantly.

Currently, for each module where I use importr I do:

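from rpy2.robjects.packages import importr

myrlib = None  # filled in lazily on first use

def do_stuff_with_r(param):
    global myrlib
    if myrlib is None:
        # "myrlib" stands for the name of the R library to wrap
        myrlib = importr("myrlib")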

I'd like to generalize it since I do this kind of operation in many different modules and thus these lines are duplicated all over.

However, I'm not sure how to do this: this solution returns None after the first invocation, which is not really what I want. Assuming this is doable, how do I ensure that importr() is called just once for a specific argument?

Einar
  • note: the Python `import` statement uses `sys.modules`, i.e., if `myrlib` is a Python module (implemented however you like: pure Python, C, Fortran, etc.), then multiple imports of the same module are fast. – jfs Oct 06 '14 at 13:40

1 Answer


You could write your own wrapper function that caches the result of the import:

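from rpy2.robjects.packages import importr

def import_r(lib, cache={}):
    # The default dict is created once and shared by every call,
    # so it acts as a cache keyed by library name.
    if lib not in cache:
        cache[lib] = importr(lib)
    return cache[lib]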

And use that every time you want to use importr instead?

Ben
  • Wanted to, but I had to wait until the timer ran out. ;) – Einar Oct 06 '14 at 08:31
  • @Einar: to avoid unnecessary lookups `ret = cache.get(lib) \n if ret is None: ret = cache[lib] = importr(lib); \n return ret`. The general (perhaps slower) version is `functools.lru_cache()` decorator. – jfs Oct 06 '14 at 13:43
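
The `functools.lru_cache()` variant mentioned in the last comment could look roughly like the sketch below. To keep it runnable without R installed, `importr` here is a hypothetical stand-in that only counts its calls, and `call_count` is likewise made up for illustration; in real code you would import `importr` from `rpy2.robjects.packages` instead.

```python
from functools import lru_cache

# Hypothetical stand-in for rpy2's importr so the sketch runs without R.
# It records how often each library name was actually "imported".
call_count = {}

def importr(name):
    call_count[name] = call_count.get(name, 0) + 1
    return object()  # placeholder for the wrapped R package

@lru_cache(maxsize=None)
def import_r(lib):
    """Call importr(lib) once per distinct name and reuse the result."""
    return importr(lib)

stats = import_r("stats")          # first call runs importr
assert import_r("stats") is stats  # later calls return the cached object
```

With `maxsize=None` the cache never evicts entries, which matches the dict-based wrapper in the answer; each module can then call `import_r(...)` inside its functions without the `global` boilerplate.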