
I am working on some programming problems where I have to convert probabilities between standard space and log-space. For this purpose, I am trying to find the maximum absolute floating-point error in R for the computation log(exp(...)), where the input is a log-probability (i.e., a non-positive number).

At the moment I have computed the answer using a grid search (see code and plot below), but I'm not sure if the value I've computed is correct. (I checked some other ranges, but the range shown in the plot seems to produce the largest absolute error.)

#Set function for computing floating-point error of log(exp(...))
fp.error <- function(x) { abs(log(exp(x)) - x) }

#Compute and plot floating-point error over a grid of non-positive values
xx <- -(0:20000/10000)
ff <- fp.error(xx)
plot(xx, ff, col = '#0000FF10',
     main = 'Error in computation of log(exp(...))', 
     xlab = 'x', ylab = 'Floating-Point Error')

#Compute maximum floating-point error
fp.error.max <- max(ff)
fp.error.max
[1] 1.110223e-16

[Plot: 'Error in computation of log(exp(...))', floating-point error vs. x over [-2, 0]]

From this analysis, my estimated maximum absolute error is half the size of .Machine$double.eps (which is 2.220446e-16). I am not sure if there is a theoretical reason for this, or if I am getting the wrong answer.

Question: Is there any way to determine if this is really the maximum floating-point error for this computation? Is there any theoretical way to compute the maximum, or is this kind of grid-search method sufficient?
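For reference, here is a direct comparison of the grid maximum against half of .Machine$double.eps (a small sketch reusing the fp.error grid from above):

```r
# Recompute the grid maximum and put it side by side with half of machine epsilon
fp.error <- function(x) { abs(log(exp(x)) - x) }
xx <- -(0:20000/10000)
c(max_error = max(fp.error(xx)), half_eps = .Machine$double.eps / 2)
```

Both print as 1.110223e-16 on my machine, which is what prompted the question.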

Ben
  • Are you only concerned with [0...2.0]? What about larger values? – chux - Reinstate Monica Jun 21 '21 at 13:50
  • @chux: Edited --- I checked some other ranges but the one shown seems to get the biggest absolute error. – Ben Jun 21 '21 at 22:57
  • if you're working with log-probabilities/likelihoods, then it's good to stay away from exponentiating if possible. there are functions like `logsumexp` that allow you to stay in log-space where possible, see also `log1p` and `expm1` for more numerically stable functions – Sam Mason Jun 22 '21 at 11:55
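A minimal sketch of the log-sum-exp trick mentioned in the comments (the helper name `logsumexp` is illustrative; it shifts by the maximum so `exp` never overflows or underflows):

```r
# Numerically stable log(sum(exp(x))): subtract the max before exponentiating
logsumexp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

logsumexp(c(-1000, -1000))       # -999.3069 (i.e. -1000 + log(2))
log(sum(exp(c(-1000, -1000))))   # -Inf: exp(-1000) underflows to 0
```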

3 Answers


I think you got the right answer. Here I refined the step size down to sqrt(.Machine$double.eps), and you will see

> x <- seq(0, 2, by = sqrt(.Machine$double.eps))

> max(abs(log(exp(x)) - x))
[1] 1.110725e-16

However, once x is extremely large, you will get an Inf error, e.g.,

> (x <- .Machine$double.xmax)
[1] 1.797693e+308

> max(abs(log(exp(x)) - x))
[1] Inf
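The cutoff can be recovered with log itself; a quick sketch of where the overflow starts (the exact threshold is log(.Machine$double.xmax), roughly 709.78):

```r
# exp(x) overflows to Inf once x exceeds log(.Machine$double.xmax)
log(.Machine$double.xmax)  # 709.7827
exp(709)                   # still finite
exp(710)                   # Inf, so log(exp(710)) - 710 is Inf as well
```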
ThomasIsCoding
  • note that you'll get `Inf` for any `x > log(.Machine$double.xmax)`, i.e. around 710 – Sam Mason Jun 21 '21 at 12:43
  • @SamMason Yes. Actually `Inf` stems from `exp`, then `log(exp(x))` gives `Inf` in turn. I just want to show OP the possible error value. – ThomasIsCoding Jun 21 '21 at 12:46
  • note, I meant to suggest using `log` to recover the value that would cause `exp` to overflow – Sam Mason Jun 21 '21 at 12:57
  • @SamMason It seems OP is not looking for workarounds but the possible errors by `log(exp(...))`. If OP wants to minimize the risk of errors, `exp(log(x))` should be applied instead. – ThomasIsCoding Jun 21 '21 at 12:58
  • @ThomasIsCoding: Sorry to be a pain --- I just updated my question. I'm actually dealing with inputs that are log-probabilities so they are non-positive (which might make a difference to your answer). – Ben Jun 21 '21 at 22:53

The error of log(exp(x)) depends on the value of x. When x is stored as a double, it has a finite precision (the spacing to the next representable value) that depends on its magnitude. This precision can be calculated with nextafter from C:

library(Rcpp)
cppFunction("double getPrec(double x) {
  return nextafter(x, std::numeric_limits<double>::infinity()) - x;}")

getPrec(2)
#[1] 4.440892e-16

getPrec(exp(2))
#[1] 8.881784e-16

or not using Rcpp:

# Spacing from x up to the next representable double, without Rcpp
getPrecR <- function(x) {
  y <- log2(pmax(.Machine$double.xmin, abs(x)))
  ifelse(x < 0 & floor(y) == y, 2^(y-1), 2^floor(y)) * .Machine$double.eps
}
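As a quick check, the pure-R version reproduces the nextafter-based values shown above:

```r
# Should match getPrec(2) and getPrec(exp(2)) from the Rcpp version
getPrecR(2)        # 4.440892e-16
getPrecR(exp(2))   # 8.881784e-16
```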

See also: What is the correct/standard way to check if difference is smaller than machine precision?.

GKi
  • Sorry to be a pain --- I just updated my question. I'm actually dealing with inputs that are log-probabilities so they are non-positive. – Ben Jun 21 '21 at 22:54
  • For those values the maximum error will be in this range: `getPrec(0)` gives `4.940656e-324`, so values from `0 - 4.940656e-324/2` to `0 + 4.940656e-324/2` will be stored as `0`, and `getPrec(exp(0))` gives `2.220446e-16`, which gives a range of `1 - 2.220446e-16/2` to `1 + 2.220446e-16/2` for `1`. So your calculated error of `1.110223e-16` says a `1` is a `1`, but a `1` stored as a double can range over `1 +- 1.110223e-16`, so I would say the maximum possible error is around `4.940656e-324 + 2.220446e-16` when only considering the precision of `0` and `1`. – GKi Jun 22 '21 at 06:53

In general, I'd suggest using a randomised approach to generate a lot more xs, e.g.:

x <- runif(10000000, 0, 2)

Your regularly spaced values might happen to stumble on a pattern that "just works".

Also it depends on whether you care about absolute error or relative error. The absolute error stays close to .Machine$double.eps, while the relative error increases as x approaches zero. E.g. log(exp(1e-16)) gets truncated to zero.
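That truncation can be seen directly (assuming standard double precision):

```r
# Near zero the absolute error is tiny but the relative error reaches 100%
x <- 1e-16
log(exp(x))               # 0: exp(1e-16) rounds to exactly 1, and log(1) is 0
abs(log(exp(x)) - x)      # absolute error of 1e-16
abs(log(exp(x)) - x) / x  # relative error of 1
```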

Sam Mason