Let's say I have a package with the following function:

foo <- function() {
  Sys.sleep(1) # really expensive operation
  return(1)
}

The value of the function is always the same per run, so I would like to use memoisation.

I thought I could simply do

foo <- memoise::memoise(function() {
  Sys.sleep(1) # really expensive operation
  return(1)
})

However, this doesn't work inside a package.

Defined as a function in the global environment, it works as expected:

foo <- memoise::memoise(function() {
  Sys.sleep(1)
  return(1)
})

system.time(foo())
#>    user  system elapsed 
#>       0       0       1

system.time(foo())
#>    user  system elapsed 
#>    0.01    0.00    0.01

Created on 2019-12-23 by the reprex package (v0.3.0)

However, once the function lives in a package, the behavior gets really weird: memoisation doesn't kick in, and every call pays the full cost. Yet as soon as I print the function definition, it starts working!

system.time(bar::foo())
#>    user  system elapsed 
#>    0.47    0.08    2.55

system.time(bar::foo())
#>    user  system elapsed 
#>       0       0       2

system.time(bar::foo())
#>    user  system elapsed 
#>    0.02    0.00    2.02

system.time(bar::foo())
#>    user  system elapsed 
#>    0.01    0.00    2.02

bar::foo
#> Memoised Function:
#> function() {
#>   Sys.sleep(2)
#>   return (1)
#> }
#> <environment: namespace:bar>

system.time(bar::foo())
#>    user  system elapsed 
#>       0       0       2

system.time(bar::foo())
#>    user  system elapsed 
#>       0       0       0

system.time(bar::foo())
#>    user  system elapsed 
#>       0       0       0

system.time(bar::foo())
#>    user  system elapsed 
#>       0       0       0

For the record, here are the relevant parts of the NAMESPACE and DESCRIPTION files:

# NAMESPACE
export(foo)
importFrom(memoise,memoise)

# DESCRIPTION [...]
Imports:
    memoise

What's going on here, and what should I do to make memoisation work from the start in my package?

Wasabi

1 Answer

This looks like a bug in the memoise package. When you are working on your own package, R may add debug information (called srcrefs) to functions. Something about those causes the hash to come out differently every time you call the function, so memoise never recognizes that you are calling it with the same arguments.
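To make the srcref idea concrete, here is a minimal base-R sketch. It parses a function with source retention forced on (roughly what `--with-keep.source` does at install time) and then strips the source again with `utils::removeSource`. The names `f` and `g` are just for illustration, and the sketch only shows that the attribute exists and can be removed; it does not reproduce the hashing problem itself.

```r
# Parse a function while keeping its source, so the resulting closure
# carries a "srcref" attribute (names f and g are illustrative only).
f <- eval(parse(text = "function() {\n  1\n}", keep.source = TRUE))
has_srcref_before <- !is.null(attr(f, "srcref"))

# removeSource() returns a copy of the function without srcref information.
g <- utils::removeSource(f)
has_srcref_after <- !is.null(attr(g, "srcref"))

print(c(before = has_srcref_before, after = has_srcref_after))
```

Running the same `attr(bar::foo, "srcref")` style of check on an installed package function is one way to tell whether it was installed with source kept.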

A simple workaround is to remove the install option "--with-keep.source" when you install your own package. (If you're using RStudio, this is added automatically in Project Options | Build Tools | Install and Restart... .) This will stop R from adding the srcref, and the bug in memoise won't be triggered. Unfortunately, this cripples the debugger in RStudio and other front-ends, so it's not ideal.

Another workaround that doesn't mess with the debugger (except for that one function) is to use removeSource on the target that is being memoised. For example,

foo <- memoise::memoise(removeSource(function() {
  Sys.sleep(1) # really expensive operation
  return(1)
}))
user2554330
  • I just noticed something in the `?memoise` help page: it recommends doing the memoisation in `.onLoad`, not in the source. That's probably another workaround. – user2554330 Dec 23 '19 at 21:13
  • Interestingly, the `memoise` GitHub repository has [modified documentation](https://github.com/r-lib/memoise/blob/58d39726de141fefd235557a33e6478f76b0ad7f/R/memoise.R#L35) which suggests using `.onLoad` to perform the actual memoisation. The [reasoning for it](https://github.com/r-lib/memoise/issues/76) seems to be more related to build-time dependencies, though. – Wasabi Dec 23 '19 at 21:13
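
For reference, the `.onLoad` approach mentioned in both comments could look roughly like the sketch below. It assumes the package is called `bar` as in the question; the file names `R/foo.R` and `R/zzz.R` are only a common convention, not something from the thread. Whether this also sidesteps the srcref issue is not confirmed here, so treat it as something to try rather than a guaranteed fix.

```r
# R/foo.R (hypothetical file name): define the plain, un-memoised function.
foo <- function() {
  Sys.sleep(1) # really expensive operation
  return(1)
}

# R/zzz.R (hypothetical file name): memoise when the namespace is loaded,
# so the memoised wrapper is created at load time rather than at build time.
.onLoad <- function(libname, pkgname) {
  # `<<-` rebinds foo in the enclosing environment (the package namespace).
  foo <<- memoise::memoise(foo)
}
```

With this layout, `memoise` still needs to be listed under `Imports:` in DESCRIPTION, as it already is in the question.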