
What is the most numerically stable way of calculating:

log[(wx * exp(x) + wy * exp(y)) / (wx + wy)]

where the weights wx, wy > 0?

Without the weights, this function is logaddexp and could be implemented in Python with NumPy as:

import numpy as np

def logaddexp(x, y):
    tmp = x - y
    return np.where(tmp > 0,
                    x + np.log1p(np.exp(-tmp)),
                    y + np.log1p(np.exp(tmp)))

How should I generalize this to the weighted version?

Neil G

2 Answers


You could use the original logaddexp function for this purpose, if you rewrite the weighted expression as

log[(wx * exp(x) + wy * exp(y)) / (wx + wy)]
  = log[exp(x + log(wx)) + exp(y + log(wy))] - log(wx + wy)

This is equivalent to,

logaddexp(x + log(wx), y + log(wy)) - log(wx + wy)

which should be as numerically stable as the original logaddexp implementation.

Note: I'm referring to the numpy.logaddexp function that takes x and y, not x and exp(y).
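As a sketch, this rewrite maps directly onto NumPy (the wrapper name `weighted_logaddexp` here is just illustrative):

```python
import numpy as np

def weighted_logaddexp(x, wx, y, wy):
    # log[(wx * exp(x) + wy * exp(y)) / (wx + wy)]
    # rewritten as logaddexp(x + log(wx), y + log(wy)) - log(wx + wy),
    # so the only exponentials are taken inside the stable logaddexp.
    return np.logaddexp(x + np.log(wx), y + np.log(wy)) - np.log(wx + wy)
```

Because the shift by log(wx) and log(wy) happens in log space, this stays finite even where exp(x) itself would overflow, e.g. for x = y = 1000 it returns exactly 1000.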

rth
  • Looks like it is probably better than what I did, which I will add as an answer for comparison. – Neil G Jul 12 '15 at 18:50
  • For anyone reading, I tested this using the arbitrary-precision library `mpmath` and found that it is much better than my solution. – Neil G Jul 16 '15 at 10:00
  • @NeilG Yeah, I suppose no matter how you rewrite it, you are still going to lose precision/overflow with 64 bit floats etc., when taking an exponential of large numbers and computing the log back. `mpmath` does seem to be a good choice here, although it will be slower. – rth Jul 16 '15 at 10:04
  • 1
    I'm not using `mpmath` in my code. I just used it to measure the average error of each of our solutions so that I could know which was actually better. `mpmath` gives an arbitrary-precision answer. My intuition when writing my solution was to use `log1p` as much possible. This turned out to be much worse. – Neil G Jul 16 '15 at 10:07
import numpy as np

def weighted_logaddexp(x, wx, y, wy):
    # Returns:
    #   log[(wx * exp(x) + wy * exp(y)) / (wx + wy)]
    #   = log(wx/(wx+wy)) + x + log(1 + exp(y - x + log(wy) - log(wx)))
    #   = log1p(-wy/(wx+wy)) + x + log1p((wy * exp(y)) / (wx * exp(x)))
    if wx == 0.0:
        return y
    if wy == 0.0:
        return x
    total_w = wx + wy
    wx_exp_x = wx * np.exp(x)
    wy_exp_y = wy * np.exp(y)
    # Pair each branch with the matching first term, expanding around
    # whichever weighted exponential is larger.
    return np.where(wy_exp_y < wx_exp_x,
                    x + np.log1p(-wy / total_w) + np.log1p(wy_exp_y / wx_exp_x),
                    y + np.log1p(-wx / total_w) + np.log1p(wx_exp_x / wy_exp_y))

Here's how I compared the two solutions:

import math
import numpy as np
import mpmath as mp
from tools.numpy import weighted_logaddexp

def average_error(ideal_function, test_function, n_args):
    x_y = [np.linspace(0.1, 3, 20) for _ in range(n_args)]
    xs_ys = np.meshgrid(*x_y)

    def e(*args):
        return ideal_function(*args) - test_function(*args)
    e = np.frompyfunc(e, n_args, 1)
    error = e(*xs_ys) ** 2
    return np.mean(error)


def ideal_function(x, wx, y, wy):
    return mp.log((mp.exp(x) * wx + mp.exp(y) * wy) / mp.fadd(wx, wy))

def test_function(x, wx, y, wy):
    return np.logaddexp(x + math.log(wx), y + math.log(wy)) - math.log(wx + wy)

mp.prec = 100
print(average_error(ideal_function, weighted_logaddexp, 4))
print(average_error(ideal_function, test_function, 4))
Neil G