
I am solving a problem in three different ways; two are recursive and I memoize them myself. The third is not recursive but uses math.factorial. I need to know whether I have to add explicit memoization to it as well.

Thanks.

Paddy3118
  • you can check it yourself by running `math.factorial` twice with a large enough number, e.g. n=50000 – orip Feb 12 '11 at 06:53
  • if you are interested in different implementations of factorial in Python, this is a nice article: http://metaleks.net/programming/the-evolution-of-a-python-programmer – Asterisk Feb 12 '11 at 07:01
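orip's suggestion above can be sketched as a quick timing check. This is a rough illustration only; the exact numbers depend on your machine, but the point is what the ratio between the two calls tells you:

```python
import timeit

# Time math.factorial(50000) twice. If factorial were memoized internally,
# the second call would be dramatically faster than the first; since it is
# not, both calls take roughly the same time.
first = timeit.timeit('math.factorial(50000)', setup='import math', number=1)
second = timeit.timeit('math.factorial(50000)', setup='import math', number=1)
print(first, second)
```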

4 Answers


Search for math_factorial at this link and you will find CPython's implementation of it (in C):

http://svn.python.org/view/python/trunk/Modules/mathmodule.c?view=markup

P.S. This is for Python 2.6.

Asterisk

Python's math.factorial is not memoized; it is a simple loop that multiplies the values from 1 up to its argument. If you need memoization, you have to add it explicitly.

Here is a simple way to memoize using the dictionary setdefault method.

import math
cache = {}
def myfact(x):
    return cache.setdefault(x,math.factorial(x))
print myfact(10000)
print myfact(10000)
Senthil Kumaran
  • Thanks Senthil. I was able to confirm this from the source. – Paddy3118 Feb 12 '11 at 09:47
  • @Paddy3118 - Added a simple example. It does not use a decorator, but just uses the dictionary's setdefault method, which can be useful for such purposes. – Senthil Kumaran Feb 12 '11 at 10:34
  • Thanks again @Senthil. I am using a less flexible version of the code from [here](http://pypi.python.org/pypi/decorator/3.1.2#statement-of-the-problem), that I can use to decorate three [functions](http://rosettacode.org/wiki/Catalan_numbers#Python). – Paddy3118 Feb 12 '11 at 13:24
  • This does not achieve any performance benefits. `setdefault`, like all Python methods, always evaluates all its arguments - it does not short-circuit when the key is already present in the dictionary. Instead, you need to either use `result = cache.get(x)` and test for None, or `if x in cache: ...`. I've made that same mistake myself. `cache.get(x, math.factorial(x))` is no better either. – Peter DeGlopper Oct 11 '13 at 05:01
  • I think @Peter must be right. Unfortunately I can't change my upvote. :-) For another way of memoizing a function, see http://www.python-course.eu/python3_memoization.php – LarsH Jul 29 '14 at 04:30
  • @PeterDeGlopper you are right. In addition, defining the memoized factorial this way does not help very much. If you call myfact(1000) the result gets memoized, but if you call myfact(999) soon after that, the result is calculated without taking into account that myfact(1000) has already computed myfact(999). The only way to achieve a good performance improvement is to redefine a memoized recursive factorial function from scratch. – Domenico De Felice Nov 29 '14 at 11:47
  • (continuing from my last comment) Actually implementing it in a recursive way can soon lead to excessive recursion depth. Better to implement it in an iterative way. – Domenico De Felice Nov 29 '14 at 12:19
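Peter DeGlopper's point above can be verified directly. In this sketch (the tracking helper is mine, just for illustration) we count how many times the factorial is really computed under the `setdefault` version versus a membership-checked version:

```python
import math

calls = []  # records every argument for which the factorial is really computed

def tracked_fact(x):
    calls.append(x)
    return math.factorial(x)

cache = {}

def myfact_setdefault(x):
    # Looks memoized, but setdefault evaluates tracked_fact(x) on every call.
    return cache.setdefault(x, tracked_fact(x))

def myfact_checked(x):
    # Actually memoized: the factorial is computed only on a cache miss.
    if x not in cache:
        cache[x] = tracked_fact(x)
    return cache[x]

myfact_setdefault(10)
myfact_setdefault(10)
print(len(calls))  # 2: the factorial was computed twice

calls[:] = []
cache.clear()
myfact_checked(10)
myfact_checked(10)
print(len(calls))  # 1: the second call hits the cache
```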

Python's math.factorial is not memoized.

I'm going to guide you through some trial-and-error examples to show why, in order to get a really memoized and working factorial function, you have to redefine it from scratch, taking a couple of things into account.

The other answer is actually not correct. Here,

import math
cache = {}
def myfact(x):
    return cache.setdefault(x,math.factorial(x))

the line

return cache.setdefault(x,math.factorial(x))

evaluates math.factorial(x) every time, regardless of whether x is already in the cache, and therefore you gain no performance improvement.

You may think of doing something like this:

if x not in cache:
    cache[x] = math.factorial(x)
return cache[x]

but actually this is wrong as well. Yes, you avoid computing the factorial of the same x twice, but consider what happens if you calculate myfact(1000) and soon after that myfact(999). Both of them get calculated completely, taking no advantage of the fact that computing myfact(1000) already computes myfact(999) along the way.

It comes natural then to write something like this:

def memoize(f):
    """Returns a memoized version of f"""
    memory = {}
    def memoized(*args):
        if args not in memory:
            memory[args] = f(*args)
        return memory[args]
    return memoized

@memoize
def my_fact(x):
    assert x >= 0
    if x == 0:
        return 1
    return x * my_fact(x - 1)

This is going to work. Unfortunately it soon reaches the maximum recursion depth.
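You can watch the limit being hit without crashing the interpreter. This sketch (restating the decorator above so the snippet is self-contained) triggers and catches the error:

```python
import sys

def memoize(f):
    """Returns a memoized version of f"""
    memory = {}
    def memoized(*args):
        if args not in memory:
            memory[args] = f(*args)
        return memory[args]
    return memoized

@memoize
def my_fact(x):
    assert x >= 0
    if x == 0:
        return 1
    return x * my_fact(x - 1)

# Pick a value comfortably beyond the default recursion limit.
n = sys.getrecursionlimit() * 2
failed = False
try:
    my_fact(n)
except RuntimeError:  # RecursionError in Python 3 is a RuntimeError subclass
    failed = True
print(failed)  # True: the recursive version blows the stack
```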

So how to implement it?

Here is an example of a truly memoized factorial, which takes advantage of how factorials work and does not consume the stack with recursive calls:

# The 'max' key stores the maximum number for which the factorial is stored.
fact_memory = {0: 1, 1: 1, 'max': 1}

def my_fact(num):
    # Factorial is defined only for non-negative numbers
    assert num >= 0

    if num <= fact_memory['max']:
        return fact_memory[num]

    for x in range(fact_memory['max']+1, num+1):
        fact_memory[x] = fact_memory[x-1] * x
    fact_memory['max'] = num
    return fact_memory[num]
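As a quick sanity check of the function above (repeated here so the snippet is self-contained), you can compare it against math.factorial:

```python
import math

# The 'max' key stores the largest number whose factorial is cached.
fact_memory = {0: 1, 1: 1, 'max': 1}

def my_fact(num):
    # Factorial is defined only for non-negative numbers
    assert num >= 0
    if num <= fact_memory['max']:
        return fact_memory[num]
    for x in range(fact_memory['max'] + 1, num + 1):
        fact_memory[x] = fact_memory[x - 1] * x
    fact_memory['max'] = num
    return fact_memory[num]

print(my_fact(1000) == math.factorial(1000))  # True
# After computing my_fact(1000), my_fact(999) is just a dictionary lookup.
print(my_fact(999) == math.factorial(999))    # True
```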

I hope you find this useful.

EDIT:

Just as a note: a way to achieve this same optimization, while keeping the conciseness and elegance of recursion, would be to redefine the function as a tail-recursive function.

def memoize(f):
    """Returns a memoized version of f"""
    memory = {}
    def memoized(*args):
        if args not in memory:
            memory[args] = f(*args)
        return memory[args]
    return memoized

@memoize
def my_fact(x, fac=1):
    assert x >= 0
    if x < 2:
        return fac
    return my_fact(x-1,  x*fac)

Tail-recursive functions can in fact be recognized by the interpreter/compiler and be automagically translated/optimized into an iterative version, but not all interpreters/compilers support this.

Unfortunately Python does not support tail-recursion optimization, so you still get:

RuntimeError: maximum recursion depth exceeded

when the input of my_fact is high.

  • You want to omit the recursion and use a loop. – Paddy3118 Dec 01 '14 at 14:37
  • Can you rephrase that, please? Are you asking if it is possible to do the same with a recursive definition of the function? If that's the question, it would be possible by redefining the function in a tail-recursive way, as long as the interpreter you are using supports this. Tail-recursive functions can in fact be recognized by the compiler/interpreter and be translated/optimized into an iterative algorithm. Unfortunately, as you can see from my last edit, the standard Python 2.7.3 interpreter does not seem to support this. – Domenico De Felice Dec 03 '14 at 08:23
  • There are many decorators that perform pseudo-optimization of tail-calls in Python, that is they prevent the call-stack from overflowing. – Eli Korvigo Nov 12 '15 at 13:41
  • @EliKorvigo interesting. Could you provide more info about it? – Domenico De Felice Nov 12 '15 at 18:34
  • @DomenicoDeFelice Sure, I personally prefer the `recur.tco` decorator from the `fn` package for several reasons, though it has some limitations (namely, your function must be modified in a special way). I've seen many decorators that work with almost any function. [This one](http://code.activestate.com/recipes/474088-tail-call-optimization-decorator/) is an example (it might get tricky with nested functions), and there are many other examples given in the comments. – Eli Korvigo Nov 13 '15 at 04:15
  • @DomenicoDeFelice My pleasure. I've posted another implementation of a memoized factorial function. You might want to look at it. – Eli Korvigo Nov 16 '15 at 11:57
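The pseudo-optimization of tail calls that Eli Korvigo mentions can be sketched with a simple trampoline: the tail call is returned as a zero-argument function instead of being performed, so the Python stack never grows. The names here are mine, not from any particular package:

```python
import math

def trampoline(f):
    """Call f; while it returns a thunk (a callable), keep calling the thunk."""
    def run(*args):
        result = f(*args)
        while callable(result):
            result = result()
        return result
    return run

def fact_tail(x, acc=1):
    assert x >= 0
    if x < 2:
        return acc
    # Return the tail call wrapped in a lambda instead of performing it,
    # so no stack frames accumulate.
    return lambda: fact_tail(x - 1, x * acc)

fact = trampoline(fact_tail)
print(fact(10))                            # 3628800
print(fact(5000) == math.factorial(5000))  # True, with no recursion error
```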

I'm late to the party, yet here are my 2c on implementing an efficient memoized factorial function in Python. This approach is more efficient because it relies on an array-like structure (i.e. a list) rather than a hashed container (i.e. a dict). It involves no recursion (sparing you some Python function-call overhead) and no slow for-loops. And it is (arguably) functionally pure, as there are no outer side effects (i.e. it doesn't modify a global variable). It caches all intermediate factorials, hence if you've already calculated factorial(n), it will take you O(1) to calculate factorial(m) for any 0 <= m <= n, and O(m-n) for any m > n.

def inner_func(f):
    # Call f immediately and bind its return value to the decorated name.
    return f()


@inner_func
def factorial():
    # The cache persists between calls as a closure over calculate_factorial.
    factorials = [1]

    def calculate_factorial(n):
        assert n >= 0
        # Extend the cached list up to index n (a no-op if it is already
        # long enough), then index into it.
        return reduce(lambda cache, num: (cache.append(cache[-1] * num) or cache),
                      xrange(len(factorials), n+1), factorials)[n]

    return calculate_factorial
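The snippet above targets Python 2, where reduce and xrange are built-ins. A rough Python 3 port, with a quick check, might look like this (functools.reduce and range substituted; otherwise the logic is unchanged):

```python
from functools import reduce

def inner_func(f):
    # Call f immediately and bind its return value to the decorated name.
    return f()

@inner_func
def factorial():
    factorials = [1]  # cached factorials; factorials[k] == k!

    def calculate_factorial(n):
        assert n >= 0
        # Extend the cache up to index n (a no-op if it is already long
        # enough), then index into it.
        return reduce(lambda cache, num: (cache.append(cache[-1] * num) or cache),
                      range(len(factorials), n + 1), factorials)[n]

    return calculate_factorial

print(factorial(5))  # 120
print(factorial(3))  # 6: served from the cached list
```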
Eli Korvigo
  • Nice solution. The only things I don't like are: 1) the `or cache` relying on the knowledge that `append` returns `None` (something more explicit would be better and clearer to understand); 2) I don't know what you mean by "slow for-loops", but, if we only care about performance, I don't think `reduce` is faster: it makes a function call to the lambda at every step (unless the interpreter does some optimization that inlines the lambda code). – Domenico De Felice Nov 20 '15 at 20:50