6

Given log(a) and log(b), I want to compute log(a+b) (in a numerically stable way).

I wrote a little function for this:

def log_add(logA,logB):
    if logA == log(0):
        return logB
    if logA<logB:
        return log_add(logB,logA)
    return log( 1 + math.exp(logB-logA) ) + logA

I wrote a program where this is by far the most time-consuming piece of code. Obviously I could try to optimize it (eliminate the recursive call, for instance).

Do you know of a standard math or numpy function for computing log(a+b) from log(a) and log(b)?

If not, do you know of a simple way to make a single C++ hook for this function? It's not a complicated function (it uses floats), and as I said, it's taking up the majority of my runtime.

Thanks in advance, numerical methods ninja!

user
  • 7,123
  • 7
  • 48
  • 90
  • 1
    For me, `log(0)` gives `ValueError: math domain error` what are you trying to use it for? Does it actually work? – agf Sep 20 '11 at 06:40
  • 1
    Micro-optimizations: possibly use `math.log1p`, `log1p(exp(logB - logA))`, rebind `log` and `exp` to the local scope `def log_add(logA, logB, log1p = math.log1p, exp = math.exp):` – agf Sep 20 '11 at 06:42
  • possible duplicate of : http://stackoverflow.com/questions/3974793/to-compute-logab – DhruvPathak Sep 20 '11 at 07:34
  • Not a duplicate; the subtle point is that I already have log(a) and log(b). The question you link to is more of a math question (i.e. what is the expansion of log(a+b)?), not a programming / numerical methods question (i.e. how do I do this fast & accurately?) – user Sep 20 '11 at 15:30
  • log(0) works for me... maybe because I am using numpy? – user Sep 20 '11 at 16:39

2 Answers2

8

Note: Best answer until now is to simply use numpy.logaddexp(logA,logB).

Why exactly do you compare with log(0)? This is equal to -numpy.inf, in this case you come to log(1 + math.exp(-inf-logB) ) + logB Which reduces itself to logB. This call always will give an warning message which is extremely slow.

I could come up with this one-liner. However you'll need to really measure to see if this is actually faster. It does only use one 'complex' calculation function instead of the two that you use, and no recursion is happening, the if is still there but hidden (and maybe optimized) in fabs/maximum.

def log_add(logA,logB):
    return numpy.logaddexp(0,-numpy.fabs(logB-logA)) + numpy.maximum(logA,logB)

edit:

I did a quick timeit() with following results :

  1. Your original version took about 120s
  2. My version took about 30s
  3. I removed the compare with log(0) from your version and it came down to 20s
  4. I edited my code to keep the logaddexp but also worked with your recursive if and it went down to 18s.

Updated code, you could also switch the recursive call with an inline updated formula but this made little difference in my timing tests:

def log_add2(logA, logB):
    if logA < logB:
        return log_add2(logB, logA)
    return numpy.logaddexp(0,logB-logA)+logA

Edit 2:

As pv noted in comments, you could actually just do numpy.logaddexp(logA, logB) which comes down to calculating log(exp(logA)+exp(logB)) which is of course equal to log(A+B). I timed it (on the same machine as above) and it went further down to about 10s. So we've come down to about 1/12, not bad ;).

KillianDS
  • 16,936
  • 4
  • 61
  • 70
  • I still may need to do something else (as I said, this is where a great deal of my CPU time is spent), but still, your 6x speedup is great! +1 – user Sep 20 '11 at 17:44
  • 3
    Why not just `numpy.logaddexp(logA, logB)`? This function is supposed to be numerically stable (and if it is not, that's a bug). – pv. Sep 20 '11 at 21:49
  • @pv +1 Are you sure that function does the same thing? If so, that's exactly what I want (something as built-in as possible)! – user Sep 21 '11 at 04:13
  • 1
    @pv: nice catch, I was so focused on the original addition formula I forgot the basic log/exp formulas. Edited the answer with timing. – KillianDS Sep 21 '11 at 06:23
  • +1 Wonderful, with a small change something is now 12x faster. – user Sep 22 '11 at 21:24
-1
def log_add(logA, logB): 
    return math.log(math.exp(logA) + math.exp(logB))

is too slow? or inaccurate?

Roshan Mathews
  • 5,788
  • 2
  • 26
  • 36
  • How could two `exp` be faster than one? – agf Sep 20 '11 at 07:23
  • 2
    The primary concern here is accuracy. If logA and logB are both very small (for instance, -1000), then exponentiating them will give you 0.0, and then the log will give you -inf. Factoring out the larger of the two and then exponentiating will give you back something close to -1000 (the right answer). – user Sep 20 '11 at 15:26