4

(1) The simple version of the problem:

How to calculate log(P1+P2+...+Pn), given log(P1), log(P2), ..., log(Pn), without taking the exp of any terms to get the original Pi. I don't want to get the original Pi because they are super small and may cause numeric computer underflow.

(2) The long version of the problem:

I am using Bayes' Theorem to calculate a conditional probability P(Y|E).

P(Y|E) = P(E|Y)*P(Y) / P(E)

I have a thousand probabilities multiplying together.

P(E|Y) = P(E1|Y) * P(E2|Y) * ... * P(E1000|Y) 

To avoid computer numeric underflow, I used log(p) and calculate the summation of 1000 log(p) instead of calculating the product of 1000 p.

log(P(E|Y)) = log(P(E1|Y)) + log(P(E2|Y)) + ... + log(P(E1000|Y))

However, I also need to calculate P(E), which is

P(E) = sum of P(E|Y)*P(Y)

log(P(E)) does not equal to the sum of log(P(E|Y)*P(Y)). How should I get log(P(E)) without solving for P(E|Y)*P(Y) (they are extremely small numbers) and adding them.

09stephenb
  • 9,358
  • 15
  • 53
  • 91
wen
  • 1,875
  • 4
  • 26
  • 43
  • Do you need the analytic answer? If not you could use log(P(E)) = log[n + sum_{i=1}^n log(P_i)] as an approximation – Azrael3000 Feb 25 '14 at 12:59

1 Answers1

2

You can use

log(P1+P2+...+Pn) = log(P1[1 + P2/P1 + ... + Pn/P1]) 
                  = log(P1) + log(1 + P2/P1 + ... + Pn/P1])

which works for any Pi. So factoring out maxP = max_i Pi results in

log(P1+P2+...+Pn) = log(maxP) + log(1+P2/maxP + ... + Pn/maxP)

where all the ratios are less than 1.

jaradniemi
  • 618
  • 4
  • 15