11

I'm writing a function to find triangle numbers and the natural way to write it is recursively:

function triangle (x)
   if x == 0 then return 0 end
   return x+triangle(x-1)
end

But attempting to calculate the first 100,000 triangle numbers fails with a stack overflow after a while. This is an ideal function to memoize, but I want a solution that will memoize any function I pass to it.

Jon 'links in bio' Ericson
  • 20,880
  • 12
  • 98
  • 148
  • 2
    Why is everyone writing complex, slow recursive functions? It is just n*(n-1)/2. See my answer below. It's in java, not Lua, but is still gets the idea across. – Fractaly Dec 06 '11 at 04:45
  • 1
    @Fractaly: Very true. The `triangle` function, however, is _not_ the thrust of the question, which is really asking about a slightly more esoteric concept. Still, that's the best solution to the problem. So good, in fact, that it's already been [suggested](http://stackoverflow.com/a/129924/1438). – Jon 'links in bio' Ericson Dec 06 '11 at 05:41
  • Thanks for pointing that out. There were a lot of answers, and I guess I missed it. It just seems like a bad example case for memoization, since it doesn't actually require it. – Fractaly Dec 07 '11 at 04:39
  • @Fractaly: I was young and foolish then... – Jon 'links in bio' Ericson Dec 07 '11 at 16:42
  • No problem. Is this for the Euler challenges, by any chance? This question really caught my attention because you have the same last name as my science teacher. Just a coincidence, I suppose. – Fractaly Dec 08 '11 at 01:20
  • @Fractaly: It's problem #12 in fact. (I should go back to attempting solutions to those...) The names are probably a coincidence unless you happen to go to GMU. ;-) – Jon 'links in bio' Ericson Dec 08 '11 at 19:18
  • Full generic scala answer [here][1] [1]: http://stackoverflow.com/questions/5875767/scala-memoize-a-function-no-matter-how-many-arguments-the-function-takes/19065888#19065888 – samthebest Sep 28 '13 at 10:29
  • See [this blog post](http://www.uncarved.com/blog/memoization.mrk) for a generic Scala solution, up to 4 arguments. – thSoft Oct 29 '09 at 11:56

15 Answers15

9

Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:

triangle[0] = 0;
triangle[x_] := triangle[x] = x + triangle[x-1]

That's it. It works because the rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.

Of course, as has been pointed out, this example has a closed-form solution: triangle[x_] := x*(x+1)/2. Fibonacci numbers are the classic example of how adding memoization gives a drastic speedup:

fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]

Although that too has a closed-form equivalent, albeit messier: http://mathworld.wolfram.com/FibonacciNumber.html

I disagree with the person who suggested this was inappropriate for memoization because you could "just use a loop". The point of memoization is that any repeat function calls are O(1) time. That's a lot better than O(n). In fact, you could even concoct a scenario where the memoized implementation has better performance than the closed-form implementation!

dreeves
  • 26,430
  • 45
  • 154
  • 229
6

You're also asking the wrong question for your original problem ;)

This is a better way for that case:

triangle(n) = n * (n - 1) / 2

Furthermore, supposing the formula didn't have such a neat solution, memoisation would still be a poor approach here. You'd be better off just writing a simple loop in this case. See this answer for a fuller discussion.

Community
  • 1
  • 1
Luke Halliwell
  • 7,312
  • 6
  • 31
  • 31
5

I bet something like this should work with variable argument lists in Lua:

local function varg_tostring(...)
    local s = select(1, ...)
    for n = 2, select('#', ...) do
        s = s..","..select(n,...)
    end
    return s
end

local function memoize(f)
    local cache = {}
    return function (...)
        local al = varg_tostring(...)
        if cache[al] then
            return cache[al]
        else
            local y = f(...)
            cache[al] = y
            return y
        end
    end
end

You could probably also do something clever with a metatables with __tostring so that the argument list could just be converted with a tostring(). Oh the possibilities.

Jon 'links in bio' Ericson
  • 20,880
  • 12
  • 98
  • 148
Lee Baldwin
  • 1,560
  • 2
  • 12
  • 10
  • Good work! I haven't looked at variable argument list in Lua yet, so this is a great example. – Jon 'links in bio' Ericson Sep 26 '08 at 21:02
  • Is there a way to convert args into a value more efficiently than converting to a string? – Aaron Sep 27 '08 at 22:38
  • NOTE: you need to escape ',' characters in the string 's' -- otherwise memoize of f("1", "2,3") will return the same value as f("1,2", "3"), even if the two functions return different results. Which would be bad. – Aaron Sep 27 '08 at 22:39
  • 1
    It could be done as a N dimension array, which would solve the comma issue, but the cache access might be less efficient. Mathematical and recursive functions are the best candidates for memoization, so I don't think these are huge issues. – Jon 'links in bio' Ericson Sep 28 '08 at 05:44
  • 1
    you should add an option to make the cache a weak table (weak keys and values), so the cache can get cleaned once in a while, and avoid memory bloating – Robert Gould Dec 12 '08 at 10:21
  • 1
    Using string concatenation in the varg_tostring function really is a performance buster, creating new strings at every concat. better would be to simply use table.concat({...},"whateverSeperatorYou'dWant") instead. – jpjacobs Jan 10 '11 at 17:54
5

In C# 3.0 - for recursive functions, you can do something like:

public static class Helpers
{
    public static Func<A, R> Memoize<A, R>(this Func<A, Func<A,R>,  R> f)
    {
        var map = new Dictionary<A, R>();
        Func<A, R> self = null;
        self = (a) =>
        {
            R value;
            if (map.TryGetValue(a, out value))
                return value;
            value = f(a, self);
            map.Add(a, value);
            return value;
        };
        return self;
    }        
}

Then you can create a memoized Fibonacci function like this:

var memoized_fib = Helpers.Memoize<int, int>((n,fib) => n > 1 ? fib(n - 1) + fib(n - 2) : n);
Console.WriteLine(memoized_fib(40));
Amir
  • 4,131
  • 26
  • 36
4

In Scala (untested):

def memoize[A, B](f: (A)=>B) = {
  var cache = Map[A, B]()

  { x: A =>
    if (cache contains x) cache(x) else {
      val back = f(x)
      cache += (x -> back)

      back
    }
  }
}

Note that this only works for functions of arity 1, but with currying you could make it work. The more subtle problem is that memoize(f) != memoize(f) for any function f. One very sneaky way to fix this would be something like the following:

val correctMem = memoize(memoize _)

I don't think that this will compile, but it does illustrate the idea.

Daniel Spiewak
  • 54,515
  • 14
  • 108
  • 120
4

Update: Commenters have pointed out that memoization is a good way to optimize recursion. Admittedly, I hadn't considered this before, since I generally work in a language (C#) where generalized memoization isn't so trivial to build. Take the post below with that grain of salt in mind.

I think Luke likely has the most appropriate solution to this problem, but memoization is not generally the solution to any issue of stack overflow.

Stack overflow usually is caused by recursion going deeper than the platform can handle. Languages sometimes support "tail recursion", which re-uses the context of the current call, rather than creating a new context for the recursive call. But a lot of mainstream languages/platforms don't support this. C# has no inherent support for tail-recursion, for example. The 64-bit version of the .NET JITter can apply it as an optimization at the IL level, which is all but useless if you need to support 32-bit platforms.

If your language doesn't support tail recursion, your best option for avoiding stack overflows is either to convert to an explicit loop (much less elegant, but sometimes necessary), or find a non-iterative algorithm such as Luke provided for this problem.

Community
  • 1
  • 1
Chris Ammerman
  • 14,978
  • 8
  • 41
  • 41
  • Actually, this function ought to be memoized even if tail recursion is in effect. To convince yourself, imagine calling it twice with two very large numbers. The second call will be much faster if the results of the first are cached. – Jon 'links in bio' Ericson Sep 24 '08 at 21:50
  • Are there still languages who *don't* implement tail call optimization ?? I thought those Microsoft guys did F#, they must have implemented it everywhere. – Alexandre C. Nov 14 '10 at 12:23
3
function memoize (f)
   local cache = {}
   return function (x)
             if cache[x] then
                return cache[x]
             else
                local y = f(x)
                cache[x] = y
                return y
             end
          end
end

triangle = memoize(triangle);

Note that to avoid a stack overflow, triangle would still need to be seeded.

Jon 'links in bio' Ericson
  • 20,880
  • 12
  • 98
  • 148
2

Here's something that works without converting the arguments to strings. The only caveat is that it can't handle a nil argument. But the accepted solution can't distinguish the value nil from the string "nil", so that's probably OK.

local function m(f)
  local t = { }
  local function mf(x, ...) -- memoized f
    assert(x ~= nil, 'nil passed to memoized function')
    if select('#', ...) > 0 then
      t[x] = t[x] or m(function(...) return f(x, ...) end)
      return t[x](...)
    else
      t[x] = t[x] or f(x)
      assert(t[x] ~= nil, 'memoized function returns nil')
      return t[x]
    end
  end
  return mf
end
Norman Ramsey
  • 198,648
  • 61
  • 360
  • 533
2

I've been inspired by this question to implement (yet another) flexible memoize function in Lua.

https://github.com/kikito/memoize.lua

Main advantages:

  • Accepts a variable number of arguments
  • Doesn't use tostring; instead, it organizes the cache in a tree structure, using the parameters to traverse it.
  • Works just fine with functions that return multiple values.

Pasting the code here as reference:

local globalCache = {}

local function getFromCache(cache, args)
  local node = cache
  for i=1, #args do
    if not node.children then return {} end
    node = node.children[args[i]]
    if not node then return {} end
  end
  return node.results
end

local function insertInCache(cache, args, results)
  local arg
  local node = cache
  for i=1, #args do
    arg = args[i]
    node.children = node.children or {}
    node.children[arg] = node.children[arg] or {}
    node = node.children[arg]
  end
  node.results = results
end


-- public function

local function memoize(f)
  globalCache[f] = { results = {} }
  return function (...)
    local results = getFromCache( globalCache[f], {...} )

    if #results == 0 then
      results = { f(...) }
      insertInCache(globalCache[f], {...}, results)
    end

    return unpack(results)
  end
end

return memoize
kikito
  • 51,734
  • 32
  • 149
  • 189
1

Here is a generic C# 3.0 implementation, if it could help :

public static class Memoization
{
    public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
    {
        var cache = new Dictionary<T, TResult>();
        var nullCache = default(TResult);
        var isNullCacheSet = false;
        return  parameter =>
                {
                    TResult value;

                    if (parameter == null && isNullCacheSet)
                    {
                        return nullCache;
                    }

                    if (parameter == null)
                    {
                        nullCache = function(parameter);
                        isNullCacheSet = true;
                        return nullCache;
                    }

                    if (cache.TryGetValue(parameter, out value))
                    {
                        return value;
                    }

                    value = function(parameter);
                    cache.Add(parameter, value);
                    return value;
                };
    }
}

(Quoted from a french blog article)

Romain Verdier
  • 12,833
  • 7
  • 57
  • 77
1

In the vein of posting memoization in different languages, i'd like to respond to @onebyone.livejournal.com with a non-language-changing C++ example.

First, a memoizer for single arg functions:

template <class Result, class Arg, class ResultStore = std::map<Arg, Result> >
class memoizer1{
public:
    template <class F>
    const Result& operator()(F f, const Arg& a){
        typename ResultStore::const_iterator it = memo_.find(a);
        if(it == memo_.end()) {
            it = memo_.insert(make_pair(a, f(a))).first;
        }
        return it->second;
    }
private:
    ResultStore memo_;
};

Just create an instance of the memoizer, feed it your function and argument. Just make sure not to share the same memo between two different functions (but you can share it between different implementations of the same function).

Next, a driver functon, and an implementation. only the driver function need be public int fib(int); // driver int fib_(int); // implementation

Implemented:

int fib_(int n){
    ++total_ops;
    if(n == 0 || n == 1) 
        return 1;
    else
        return fib(n-1) + fib(n-2);
}

And the driver, to memoize

int fib(int n) {
    static memoizer1<int,int> memo;
    return memo(fib_, n);
}

Permalink showing output on codepad.org. Number of calls is measured to verify correctness. (insert unit test here...)

This only memoizes one input functions. Generalizing for multiple args or varying arguments left as an exercise for the reader.

Aaron
  • 3,454
  • 23
  • 26
1

In Perl generic memoization is easy to get. The Memoize module is part of the perl core and is highly reliable, flexible, and easy-to-use.

The example from it's manpage:

# This is the documentation for Memoize 1.01
use Memoize;
memoize('slow_function');
slow_function(arguments);    # Is faster than it was before

You can add, remove, and customize memoization of functions at run time! You can provide callbacks for custom memento computation.

Memoize.pm even has facilities for making the memento cache persistent, so it does not need to be re-filled on each invocation of your program!

Here's the documentation: http://perldoc.perl.org/5.8.8/Memoize.html

Hercynium
  • 929
  • 9
  • 18
0

Extending the idea, it's also possible to memoize functions with two input parameters:

function memoize2 (f)
   local cache = {}
   return function (x, y)
             if cache[x..','..y] then
                return cache[x..','..y]
             else
                local z = f(x,y)
                cache[x..','..y] = z
                return z
             end
          end
end

Notice that parameter order matters in the caching algorithm, so if parameter order doesn't matter in the functions to be memoized the odds of getting a cache hit would be increased by sorting the parameters before checking the cache.

But it's important to note that some functions can't be profitably memoized. I wrote memoize2 to see if the recursive Euclidean algorithm for finding the greatest common divisor could be sped up.

function gcd (a, b) 
   if b == 0 then return a end
   return gcd(b, a%b)
end

As it turns out, gcd doesn't respond well to memoization. The calculation it does is far less expensive than the caching algorithm. Ever for large numbers, it terminates fairly quickly. After a while, the cache grows very large. This algorithm is probably as fast as it can be.

Jon 'links in bio' Ericson
  • 20,880
  • 12
  • 98
  • 148
  • Couldn't you use a vararg in the closure returned by the memoize function? In Lua, you can do things like t = {...} to pack variable argument list into a table, or directly call a function and pass the f(...). Then just pack the vararg list to string to use as the cache index. – Lee Baldwin Sep 26 '08 at 19:44
  • NOTE: this will break if arguments contain ',' comma when converted to string. eg, f("1", "2,3") will evaluate same as f("1,2", "3"), even if that is the incorrect result. – Aaron Sep 27 '08 at 22:46
0

Please don't recurse this. Either use the x*(x+1)/2 formula or simply iterate the values and memoize as you go.

int[] memo = new int[n+1];
int sum = 0;
for(int i = 0; i <= n; ++i)
{
  sum+=i;
  memo[i] = sum;
}
return memo[n];
arviman
  • 5,087
  • 41
  • 48
0

Recursion isn't necessary. The nth triangle number is n(n-1)/2, so...

public int triangle(final int n){
   return n * (n - 1) / 2;
}
Fractaly
  • 834
  • 2
  • 10
  • 25