17

i want to write a memoize function in scala that can be applied to any function object no matter what that function object is. i want to do so in a way that lets me use a single implementation of memoize. i'm flexible about the syntax, but ideally the memoize appears somewhere very close to the declaration of the function as opposed to after the function. i'd also like to avoid first declaring the original function and then a second declaration for the memoized version.

so some ideal syntax might be this:

def slowFunction(<some args left intentionally vague>) = memoize {
  // the original implementation of slow function
}

or even this would be acceptable:

def slowFunction = memoize { <some args left intentionally vague> => {
  // the original implementation of slow function
}}

i've seen ways to do this where memoize must be redefined for each arity function, but i want to avoid this approach. the reason is that i will need to implement dozens of functions similar to memoize (i.e. other decorators) and it's too much to ask to have to copy each one for each arity function.

one way to do memoize that does require you to repeat memoize declarations (so it's no good) is in the question "What type to use to store an in-memory mutable data table in Scala?".

Heinrich Schmetterling
  • To clarify: Do you expect `def foo(x: Int, y: Int) = memoize(x: Int) { factorial(x) * y }` to memoize only the subexpressions of `x` and then do the final computation in `x, y` with the memoized partial result? – Ben Jackson May 03 '11 at 21:30
  • actually no, my expectation was that it would memoize around all the function arguments together. that is, an example would be "def foo(x: Int, y: Int) = memoize { factorial(x) * y }". – Heinrich Schmetterling May 03 '11 at 22:06
  • I guess I was misled by your "some args" vs "args to memoize" that the memoization args could be a subset – Ben Jackson May 03 '11 at 22:27
  • ah thanks - i removed those for clarity. i meant they could be args to control the memoize function itself, as opposed to corresponding to the original function's args. – Heinrich Schmetterling May 03 '11 at 22:40

4 Answers

22

You can use a type-class approach to deal with the arity issue. You will still need to deal with each function arity you want to support, but not for every arity/decorator combination:

/**
 * A type class that can tuple and untuple function types.
 * @tparam U an untupled function type
 * @tparam T a tupled function type
 */
sealed class Tupler[U, T](val tupled: U => T, 
                          val untupled: T => U)

object Tupler {
   implicit def function0[R]: Tupler[() => R, Unit => R] =
      new Tupler((f: () => R) => (_: Unit) => f(),
                 (f: Unit => R) => () => f(()))
   implicit def function1[T, R]: Tupler[T => R, T => R] = 
      new Tupler(identity, identity)
   implicit def function2[T1, T2, R]: Tupler[(T1, T2) => R, ((T1, T2)) => R] = 
      new Tupler(_.tupled, Function.untupled[T1, T2, R]) 
   // ... more tuplers
}

You can then implement the decorator as follows:

/**
 * A memoized unary function.
 *
 * @param f A unary function to memoize
 * @tparam T the argument type
 * @tparam R the return type
 */
class Memoize1[-T, +R](f: T => R) extends (T => R) {
   // memoization implementation
}

object Memoize {
   /**
    * Memoize a function.
    *
    * @param f the function to memoize
    */
   def memoize[T, R, F](f: F)(implicit e: Tupler[F, T => R]): F = 
      e.untupled(new Memoize1(e.tupled(f)))
}

Your "ideal" syntax won't work because the compiler would assume that the block passed into memoize is a 0-argument lexical closure. You can, however, use your latter syntax:

// edit: this was originally (and incorrectly) a def
lazy val slowFn = memoize { (n: Int) => 
   // compute the prime decomposition of n
}
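
The edit note in the snippet above matters more than it may look. As a rough sketch of why: a def re-evaluates its right-hand side on every reference, so each call would build a new wrapper with an empty cache, while a lazy val builds the wrapper once. The names below (DefVsLazyVal, memoize, slowSquare, evaluations) are made up for this illustration, and the tiny unary memoize stands in for the Tupler-based one; it is not thread-safe.

```scala
import scala.collection.mutable

object DefVsLazyVal {
  // Counts how often the underlying computation really runs.
  var evaluations = 0

  // Hypothetical unary memoizer, kept tiny for the demonstration.
  def memoize[T, R](f: T => R): T => R = {
    val cache = mutable.Map.empty[T, R]
    (x: T) => cache.get(x) match {
      case Some(y) => y
      case None =>
        val y = f(x)
        cache.update(x, y)
        y
    }
  }

  private def slowSquare(n: Int): Int = { evaluations += 1; n * n }

  // A def re-runs memoize on every reference: each call gets a fresh, empty cache.
  def viaDef: Int => Int = memoize((n: Int) => slowSquare(n))

  // A lazy val runs memoize once, on first use, and keeps that cache afterwards.
  lazy val viaLazyVal: Int => Int = memoize((n: Int) => slowSquare(n))
}
```

Calling DefVsLazyVal.viaDef(4) twice runs slowSquare twice, whereas calling DefVsLazyVal.viaLazyVal(4) twice runs it only once.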

Edit:

To eliminate a lot of the boilerplate for defining new decorators, you can create a trait:

trait FunctionDecorator {
   final def apply[T, R, F](f: F)(implicit e: Tupler[F, T => R]): F = 
      e.untupled(decorate(e.tupled(f)))

   protected def decorate[T, R](f: T => R): T => R
}

This allows you to redefine the Memoize decorator as

object Memoize extends FunctionDecorator {
   /**
    * Memoize a function.
    *
    * @param f the function to memoize
    */
   protected def decorate[T, R](f: T => R) = new Memoize1(f)
}

Rather than invoking a memoize method on the Memoize object, you apply the Memoize object directly:

// edit: this was originally (and incorrectly) a def
lazy val slowFn = Memoize(primeDecomposition _)

or

lazy val slowFn = Memoize { (n: Int) =>
   // compute the prime decomposition of n
}
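
Since the question mentions dozens of other decorators, here is a minimal sketch of how a second one might reuse the same trait. Counted and CountedDemo are hypothetical names that do not appear in the answer, the Tupler instances are rewritten with explicit lambdas so the block stands alone, and the zero-argument instance is left out for brevity.

```scala
// Type class from the answer: converts between an n-ary function and its tupled form.
sealed class Tupler[U, T](val tupled: U => T, val untupled: T => U)

object Tupler {
  implicit def function1[T, R]: Tupler[T => R, T => R] =
    new Tupler[T => R, T => R](f => f, f => f)

  implicit def function2[T1, T2, R]: Tupler[(T1, T2) => R, ((T1, T2)) => R] =
    new Tupler[(T1, T2) => R, ((T1, T2)) => R](
      f => f.tupled,
      g => (a, b) => g((a, b)))
}

trait FunctionDecorator {
  // Tuple the function, decorate the unary form, then restore the original shape.
  final def apply[T, R, F](f: F)(implicit e: Tupler[F, T => R]): F =
    e.untupled(decorate(e.tupled(f)))

  protected def decorate[T, R](f: T => R): T => R
}

// Hypothetical decorator: counts calls made through any function it wraps.
// The counter is shared by all wrapped functions and is not thread-safe.
object Counted extends FunctionDecorator {
  var calls = 0

  protected def decorate[T, R](f: T => R): T => R =
    (x: T) => { calls += 1; f(x) }
}

object CountedDemo {
  // The same decorator applies to a binary and a unary function.
  val add: (Int, Int) => Int = Counted((a: Int, b: Int) => a + b)
  val twice: Int => Int = Counted((n: Int) => n * 2)
}
```

Only decorate has to be written per decorator; the arity handling lives in Tupler and is shared.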
Aaron Novstrup
  • @aaron thanks, this is a pretty elegant solution. is it possible to add an implicit to the Tupler object in order to support decorating functions with 0 arguments? i was playing around with this and couldn't figure out how to define the type parameters of a function0 since there is only one type R now as the input to the function has no type. – Heinrich Schmetterling May 04 '11 at 10:32
  • 2
    @Heinrich: The language already supports a memoized function with 0 arguments: `val`. :) – Ben Jackson May 04 '11 at 17:35
  • @ben agreed, i'm just not seeing how to combine that with the above scheme such that you only need to implement decorate[T, R] once. any thoughts? – Heinrich Schmetterling May 04 '11 at 18:02
  • @Heinrich Updated the code to handle zero-arg functions. It just required generalizing the Tupler type signature a bit. – Aaron Novstrup May 04 '11 at 19:06
  • @aaron great, thanks. one thing i'm curious about is whether you can reuse an instance of Memoize1 across different invocations of slowFn. that is, once you declare slowFn, it actually creates an instance of Memoize1 each time you call slowFn. it seems like you can get around this by doing "val slowFn" instead of "def slowFn" but i wonder if there's a different way? – Heinrich Schmetterling May 05 '11 at 01:32
  • @Heinrich No, it should really be a `val`; it's intended to define a function rather than a method. For some decorators it may not matter, but for memoization using a `def` defeats the purpose of the decorator. – Aaron Novstrup May 05 '11 at 06:03
  • even better: `lazy val`, this way the value is not calculated until needed. – VasiliNovikov Sep 26 '13 at 09:21
  • This is pretty cool. My biggest problem here is that you lose the descriptive API of a normal method `def` to make memoization transparent, in a way that lets you apply memoization without reference to the actual function context (like `def myMethod(p1, p2) = Memoize(_*) { ??? }`, assuming _* pulled all args of the caller and passed them through). But I guess without that syntactic invention, this can't work. – acjay May 22 '14 at 19:51
  • @acjay If the computation were part of a class's public API, one could make the memoized function private and then expose it via a method: `private lazy val memoFn = Memoize(fn _); def myMethod(p1: A, p2: B): C = memoFn(p1, p2)` – Aaron Novstrup May 22 '14 at 22:33
  • @AaronNovstrup I'm not sure I understand. It doesn't look like memoFn is using all of the params. Also, where would fn be defined? I guess what I'm also lamenting is a solution that requires no repetition of the parameters. The same way one might be able to use `arguments` in Javascript (although I admit, I think implicit `arguments` feels super clunky).. – acjay May 22 '14 at 23:06
  • @acjay This example assumes `fn` is some existing, slow, two-argument function (perhaps also a private member) -- `memoFn` would be the memoized version of `fn` and `myMethod` would be the public API that delegates to `memoFn`. Both parameters (`p1` and `p2`) are used. Maybe I was misunderstanding your comment about a descriptive API? – Aaron Novstrup May 23 '14 at 18:05
  • @AaronNovstrup Maybe I'm reading it wrong but is the `Memoize(fn _)` syntactically correct? I'm not seeing how the 2 args are passed. But to answer what I'm looking for, I'm saying supposing the method body itself were the slow function, it would be nice to have a way to somehow decorate it for memoization with minimal change in syntax. – acjay May 23 '14 at 18:55
  • @acjay That was confusing on my part -- if `fn` is a two-arg *method*, then the `_` is necessary to "partially apply" it (i.e., turn it into a two-arg function value). If `fn` is already a two-arg function, you could just write `Memoize(fn)`. – Aaron Novstrup May 23 '14 at 22:04
  • 1
    @acjay I agree that it's somewhat unfortunate that this requires changing your method to a function value (or at least introducing a function value to delegate to), but it's an unavoidable consequence of the fact that `Memoize(f)` constructs a new function value / lookup table each time it's called. It's essential that we only call it once (per function we memoize) and store the resulting function value in a `val` or `lazy val`, or the benefit of memoization would be lost. – Aaron Novstrup May 23 '14 at 22:28
4

Library

Use Scalaz's scalaz.Memo

Manual

Below is a solution similar to Aaron Novstrup's answer and this blog, with some corrections and improvements, made shorter and easier to copy and paste :)

import scala.Predef._

class Memoized[-T, +R](f: T => R) extends (T => R) {

  import scala.collection.mutable

  private[this] val vals = mutable.Map.empty[T, R]

  def apply(x: T): R = vals.getOrElse(x, {
      val y = f(x)
      vals += ((x, y))
      y
    })
}

// TODO Use macros
// See si9n.com/treehugger/
// http://stackoverflow.com/questions/11400705/code-generation-with-scala
object Tupler {
  implicit def t0t[R]: (() => R) => (Unit) => R = (f: () => R) => (_: Unit) => f()

  implicit def t1t[T, R]: ((T) => R) => (T) => R = identity

  implicit def t2t[T1, T2, R]: ((T1, T2) => R) => ((T1, T2)) => R = (_: (T1, T2) => R).tupled

  implicit def t3t[T1, T2, T3, R]: ((T1, T2, T3) => R) => ((T1, T2, T3)) => R = (_: (T1, T2, T3) => R).tupled

  implicit def t0u[R]: ((Unit) => R) => () => R = (f: Unit => R) => () => f(())

  implicit def t1u[T, R]: ((T) => R) => (T) => R = identity

  implicit def t2u[T1, T2, R]: (((T1, T2)) => R) => ((T1, T2) => R) = Function.untupled[T1, T2, R]

  implicit def t3u[T1, T2, T3, R]: (((T1, T2, T3)) => R) => ((T1, T2, T3) => R) = Function.untupled[T1, T2, T3, R]
}

object Memoize {
  final def apply[T, R, F](f: F)(implicit tupled: F => (T => R), untupled: (T => R) => F): F =
    untupled(new Memoized(tupled(f)))

  // I haven't made the implicit tupling magic for this yet
  def recursive[T, R](f: (T, T => R) => R) = {
    var yf: T => R = null
    yf = Memoize(f(_, yf))
    yf
  }
}

object ExampleMemoize extends App {

  val facMemoizable: (BigInt, BigInt => BigInt) => BigInt = (n: BigInt, f: BigInt => BigInt) => {
    if (n == 0) 1
    else n * f(n - 1)
  }

  val facMemoized = Memoize.recursive(facMemoizable)

  override def main(args: Array[String]): Unit = {
    def myMethod(s: Int, i: Int, d: Double): Double = {
      println("myMethod ran")
      s + i + d
    }

    val myMethodMemoizedFunction: (Int, Int, Double) => Double = Memoize(myMethod _)

    def myMethodMemoized(s: Int, i: Int, d: Double): Double = myMethodMemoizedFunction(s, i, d)

    println("myMemoizedMethod(10, 5, 2.2) = " + myMethodMemoized(10, 5, 2.2))
    println("myMemoizedMethod(10, 5, 2.2) = " + myMethodMemoized(10, 5, 2.2))

    println("myMemoizedMethod(5, 5, 2.2) = " + myMethodMemoized(5, 5, 2.2))
    println("myMemoizedMethod(5, 5, 2.2) = " + myMethodMemoized(5, 5, 2.2))

    val myFunctionMemoized: (Int, Int, Double) => Double = Memoize((s: Int, i: Int, d: Double) => {
      println("myFunction ran")
      s * i + d + 3
    })

    println("myFunctionMemoized(10, 5, 2.2) = " + myFunctionMemoized(10, 5, 2.2))
    println("myFunctionMemoized(10, 5, 2.2) = " + myFunctionMemoized(10, 5, 2.2))

    println("myFunctionMemoized(7, 6, 3.2) = " + myFunctionMemoized(7, 6, 3.2))
    println("myFunctionMemoized(7, 6, 3.2) = " + myFunctionMemoized(7, 6, 3.2))
  }
}

When you run ExampleMemoize you will get:

myMethod ran
myMemoizedMethod(10, 5, 2.2) = 17.2
myMemoizedMethod(10, 5, 2.2) = 17.2
myMethod ran
myMemoizedMethod(5, 5, 2.2) = 12.2
myMemoizedMethod(5, 5, 2.2) = 12.2
myFunction ran
myFunctionMemoized(10, 5, 2.2) = 55.2
myFunctionMemoized(10, 5, 2.2) = 55.2
myFunction ran
myFunctionMemoized(7, 6, 3.2) = 48.2
myFunctionMemoized(7, 6, 3.2) = 48.2
samthebest
  • I have a question about extending this technique to a multithreaded environment here: http://stackoverflow.com/q/24320209/807674 – acjay Jun 20 '14 at 04:56
2

I was thinking that you could do something like this and then use a DynamicProxy for the actual implementation.

def memo[T<:Product, R, F <: { def tupled: T => R }](f: F )(implicit m: Manifest[F]):F

The idea is that because functions lack a common supertype, we use a structural type to find anything that can be tupled (Function2-22; you still need to special-case Function1).

I throw the Manifest in there so you can construct the DynamicProxy from the function trait that is F.

Tupling should also help with the memoization, since you can simply put the tuple in a Map[T,R].

John Nilsson
1

This works because K can be a tuple type so memo(x,y,z) { function of x, y, z } works:

import scala.collection.mutable

def memo[K,R](k: K)(f: => R)(implicit m: mutable.Map[K,R]) = m.getOrElseUpdate(k, f)

The implicit was the only way I could see to bring in the map cleanly:

implicit val fibMap = new mutable.HashMap[Int,Int]
def fib(x: Int): Int = memo(x) {
    x match {
        case 1 => 1
        case 2 => 1
        case n => fib(n - 2) + fib(n - 1)
    }
}

It feels like it should be possible to somehow wrap up an automatic HashMap[K,R] so that you don't have to make fibMap (and re-describe the type) explicitly.
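
One possible way to do that, as a sketch only: let a small helper object own the map, so the key and result types are stated once when the helper is created and no implicit is needed. Memo, FibDemo and fibMemo are hypothetical names, the base case is written as x <= 2 instead of the match above, and the cache is not thread-safe.

```scala
import scala.collection.mutable

// Hypothetical helper: each instance owns its cache, so no implicit map is required.
class Memo[K, R] {
  private val cache = mutable.HashMap.empty[K, R]

  // k is the cache key (a tuple for several arguments); f is evaluated only on a miss.
  def apply(k: K)(f: => R): R = cache.get(k) match {
    case Some(v) => v
    case None =>
      val v = f
      cache.update(k, v)
      v
  }
}

object FibDemo {
  // The types are written once here instead of on a separate implicit map.
  private val fibMemo = new Memo[Int, BigInt]

  def fib(x: Int): BigInt = fibMemo(x) {
    if (x <= 2) BigInt(1) else fib(x - 2) + fib(x - 1)
  }
}
```

The key still has to be passed explicitly, as in memo(x), so this removes the implicit map but not the repetition of the arguments.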

Ben Jackson
  • in an ideal world, you wouldn't have to pass any arguments to memo just for simplicity's sake. that is, it's kind of repeating information that is already provided by fib. – Heinrich Schmetterling May 04 '11 at 00:06
  • what is the name of the mechanism that lets you call memo(x, y, z) and have that come in as a single tuple? – Heinrich Schmetterling May 04 '11 at 00:23
  • I believe it's actually a quirk of the parser, as explained here: http://stackoverflow.com/questions/2850902/scala-coalesces-multiple-function-call-parameters-into-a-tuple-can-this-be-dis – Ben Jackson May 04 '11 at 00:32