224

Today, I was looking through some C++ code (written by somebody else) and found this section:

double someValue = ...
if (someValue <  std::numeric_limits<double>::epsilon() && 
    someValue > -std::numeric_limits<double>::epsilon()) {
  someValue = 0.0;
}

I'm trying to figure out whether this even makes sense.

The documentation for epsilon() says:

The function returns the difference between 1 and the smallest value greater than 1 that is representable [by a double].

Does this apply to 0 as well, i.e. epsilon() is the smallest value greater than 0? Or are there numbers between 0 and 0 + epsilon that can be represented by a double?

If not, then isn't the comparison equivalent to someValue == 0.0?

Yakov Galka
Sebastian Krysmanski
  • 3
    The epsilon around 1 will most likely be much higher than that around 0, so there will probably be values between 0 and 0+epsilon_at_1. I guess the author of this section wanted to use something small, but he didn't want to use a magic constant, so he just used this essentially arbitrary value. – enobayram Dec 04 '12 at 08:53
  • 2
    Comparing Floating point numbers is tough, and usage of epsilon or threshold value is even encouraged. Please refer: http://www.cs.princeton.edu/introcs/91float/ and http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm – Aditya Kumar Pandey Dec 04 '12 at 08:54
  • 45
    First link is 403.99999999 – graham.reeds Dec 04 '12 at 08:55
  • 1
    epsilon() is the smallest positive value. So if we assume that epsilon() is e, we've got that 1+e != 1, so yes, epsilon is the smallest value greater than 0 and there are no numbers between 0 and 0 + e – westwood Dec 04 '12 at 08:58
  • 6
    IMO, in this case the usage of `numeric_limits<>::epsilon` is misleading and irrelevant. What we want is to assume 0 if the actual value differs by no more than some ε from 0. And ε should be chosen based on the problem specification, not on a machine-dependent value. I'd suspect that the current epsilon is useless, as even just a few FP operations could accumulate an error greater than that. – Andrey Vihrov Dec 04 '12 at 09:47
  • 1
    +1. epsilon is not the smallest possible but can serve the given purpose in most of practical engineering tasks if you know what precision you need and what you are doing. – SChepurin Dec 04 '12 at 09:51
  • @Apokal: please read what Aditya Kumar Pandey wrote. The concept of the machine epsilon is different from what you write. – H. Brandsmeier Dec 04 '12 at 12:27
  • The answers to your three explicit questions are **no**, **yes** and **no**. Questions to ask next are 'what are double precision floating point numbers?' and 'how can I reliably compare two doubles?' – Colonel Panic Dec 04 '12 at 14:06
  • If you're thinking of closing with this question as a dupe target, STOP! And consider http://stackoverflow.com/q/17333/103167 instead. – Ben Voigt Apr 21 '15 at 03:41

12 Answers

206

Assuming a 64-bit IEEE double, there is a 52-bit mantissa and an 11-bit exponent. Let's break it down into bits:

1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^0 = 1

The smallest representable number greater than 1:

1.0000 00000000 00000000 00000000 00000000 00000000 00000001 × 2^0 = 1 + 2^-52

Therefore:

epsilon = (1 + 2^-52) - 1 = 2^-52

Are there any numbers between 0 and epsilon? Plenty... E.g. the minimal positive representable (normal) number is:

1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^-1022 = 2^-1022

In fact there are (1022 - 52 + 1)×2^52 = 4372995238176751616 numbers between 0 and epsilon, which is 47% of all the positive representable numbers...

Yakov Galka
  • 31
    So weird that you can say "47% of the positive numbers" :) – configurator Dec 04 '12 at 16:38
  • 18
    @configurator: Nah, you cannot say that (no 'natural' finite measure exists). But you can say "47% of the positive *representable* numbers". – Yakov Galka Dec 04 '12 at 16:42
  • 1
    @ybungalobill I can't figure it out. Exponent has 11 bits: 1 sign bit and 10 value bits. Why 2^-1022 and not 2^-1024 is the smallest positive number? – Pavlo Dyban Dec 05 '12 at 08:29
  • 4
    @PavloDyban: simply because exponents *do not* have a sign bit. They are encoded as offsets: if the encoded exponent is `0 <= e < 2048` then the mantissa is multiplied by 2 to the power of `e - 1023`. E.g. exponent of `2^0` is encoded as `e=1023`, `2^1` as `e=1024` and `2^-1022` as `e=1`. The value of `e=0` is reserved for subnormals and the real zero. – Yakov Galka Dec 05 '12 at 14:17
  • 3
    @PavloDyban: also `2^-1022` is the smallest *normal* number. The smallest number is actually `0.0000 00000000 00000000 00000000 00000000 00000000 00000001 × 2^-1022 = 2^-1074`. This is subnormal, meaning that the mantissa part is smaller than 1, so it is encoded with the exponent `e=0`. – Yakov Galka Dec 05 '12 at 14:21
19

The test certainly is not the same as someValue == 0. The whole idea of floating-point numbers is that they store an exponent and a significand. They therefore represent a value with a certain number of binary significant figures of precision (53 in the case of an IEEE double). The representable values are much more densely packed near 0 than they are near 1.

To use a more familiar decimal system, suppose you store a decimal value "to 4 significant figures" with exponent. Then the next representable value greater than 1 is 1.001 * 10^0, and epsilon is 1.000 * 10^-3. But 1.000 * 10^-4 is also representable, assuming that the exponent can store -4. You can take my word for it that an IEEE double can store exponents less than the exponent of epsilon.

You can't tell from this code alone whether it makes sense or not to use epsilon specifically as the bound, you need to look at the context. It may be that epsilon is a reasonable estimate of the error in the calculation that produced someValue, and it may be that it isn't.

Steve Jessop
  • 2
    Good point, but even if that is the case, a better practice would be to keep the error bound in a reasonably named variable and use it in the comparison. As it stands, it is no different from a magic constant. – enobayram Dec 04 '12 at 08:58
  • Perhaps I should've been clearer in my question: I didn't question whether epsilon was a large enough "threshold" to cover computational error but whether this comparison is equal to `someValue == 0.0` or not. – Sebastian Krysmanski Dec 04 '12 at 09:02
14

There are numbers between 0 and epsilon because epsilon is the difference between 1 and the next highest number that can be represented above 1, not the difference between 0 and the next highest number that can be represented above 0 (if it were, that code would do very little):

#include <limits>

int main ()
{
  struct Doubles
  {
      double one;
      double epsilon;
      double half_epsilon;
  } values;

  values.one = 1.0;
  values.epsilon = std::numeric_limits<double>::epsilon();
  values.half_epsilon = values.epsilon / 2.0;
}

Using a debugger, stop the program at the end of main and look at the results and you'll see that epsilon / 2 is distinct from epsilon, zero and one.

So the test in the question takes values between +/- epsilon and makes them zero.

Skizz
5

An approximation of epsilon (the smallest possible difference) around a number (1.0, 0.0, ...) can be printed with the following program. It prints this output:

epsilon for 0.0 is 4.940656e-324
epsilon for 1.0 is 2.220446e-16

A little thinking makes it clear that the epsilon gets smaller as the number we examine gets smaller, because the exponent can adjust to the size of that number.

#include <stdio.h>
#include <assert.h>

/* Find the smallest power of two that, added to m, still changes m. */
double getEps (double m) {
  double approx = 1.0;
  double lastApprox = 0.0;
  while (m + approx != m) {
    lastApprox = approx;
    approx /= 2.0;
  }
  assert (lastApprox != 0);
  return lastApprox;
}

int main () {
  printf ("epsilon for 0.0 is %e\n", getEps (0.0));
  printf ("epsilon for 1.0 is %e\n", getEps (1.0));
  return 0;
}
pbhd
4

The difference between X and the next value of X varies according to X.
epsilon() is only the difference between 1 and the next value of 1.
The difference between 0 and the next value of 0 is not epsilon().

Instead you can use std::nextafter to compare a double value with 0, as follows:

bool same(double a, double b)
{
  return std::nextafter(a, std::numeric_limits<double>::lowest()) <= b
    && std::nextafter(a, std::numeric_limits<double>::max()) >= b;
}

double someValue = ...
if (same (someValue, 0.0)) {
  someValue = 0.0;
}
Daniel Laügt
  • 1
    +1 for mentioning `nextafter`; but note that this usage doesn't likely do what the programmer is intending. Assuming 64-bit IEEE 754, in your example `same(0, 1e-100)` returns false, which is probably not what the programmer wants. The programmer probably rather wants some small threshold to test for equality, e.g. +/-`1e-6` or +/-`1e-9`, instead of +/-`nextafter`. – villapx Sep 15 '20 at 21:08
3

Suppose we are working with toy floating point numbers that fit in a 16 bit register. There is a sign bit, a 5 bit exponent, and a 10 bit mantissa.

The value of this floating point number is the mantissa, interpreted as a binary fraction, times two to the power of the exponent.

Around 1 the exponent equals zero. So the smallest digit of the mantissa is one part in 1024.

Near 1/2 the exponent is minus one, so the smallest part of the mantissa is half as large. With a five bit exponent it can reach negative 16, at which point the smallest part of the mantissa is worth one part in 32m. And at negative 16 exponent, the value is around one part in 32k, much closer to zero than the epsilon around one we calculated above!

Now this is a toy floating point model that does not reflect all the quirks of a real floating point system, but the ability to represent values smaller than epsilon carries over to real floating point formats.

Yakk - Adam Nevraumont
2

You can't apply this to 0, because of the mantissa and exponent parts. Thanks to the exponent you can store numbers that are much smaller than epsilon, but when you try to do something like (1.0 - "very small number") you'll get 1.0. Epsilon is an indicator not of value but of value precision, which lies in the mantissa: it shows how many correct significant decimal digits of a number we can store.

Arsenii Fomin
1

I think that depends on the precision of your computer. Take a look at this table: you can see that if your epsilon is represented as a double but your precision is higher, the comparison is not equivalent to

someValue == 0.0

Good question anyway!

default
Luca Davanzo
1

So let's say a system cannot distinguish 1.000000000000000000000 and 1.000000000000000000001, that is, 1.0 and 1.0 + 1e-20. Do you think there are still some values that can be represented between -1e-20 and +1e-20?

cababunga
  • Except for zero, I don't think that there are values between -1e-20 and +1e-20. But just because I think this doesn't make it true. – Sebastian Krysmanski Dec 04 '12 at 08:52
  • @SebastianKrysmanski: it's not true, there are lots of floating-point values between 0 and `epsilon`. Because it's *floating* point, not fixed point. – Steve Jessop Dec 04 '12 at 08:53
  • The smallest representable value that is distinct from zero is limited by the number of bits allocated to the exponent. So if a double has an 11-bit exponent, the smallest normal number would be 2^-1022. – cababunga Dec 04 '12 at 09:07
1

With IEEE floating-point, between the smallest non-zero positive value and the smallest non-zero negative value, there exist two values: positive zero and negative zero. Testing whether a value is between the smallest non-zero values is equivalent to testing for equality with zero; the assignment, however, may have an effect, since it would change a negative zero to a positive zero.

It would be conceivable that a floating-point format might have three values between the smallest finite positive and negative values: positive infinitesimal, unsigned zero, and negative infinitesimal. I am not familiar with any floating-point formats that in fact work that way, but such a behavior would be perfectly reasonable and arguably better than that of IEEE (perhaps not enough better to be worth adding extra hardware to support it, but mathematically 1/(1/INF), 1/(-1/INF), and 1/(1-1) should represent three distinct cases illustrating three different zeroes). I don't know whether any C standard would mandate that signed infinitesimals, if they exist, would have to compare equal to zero. If they do not, code like the above could usefully ensure that e.g. dividing a number repeatedly by two would eventually yield zero rather than being stuck on "infinitesimal".

supercat
  • Isn't "1/(1-1)" (from your example) infinity rather than zero? – Sebastian Krysmanski Dec 05 '12 at 08:34
  • The quantities (1-1), (1/INF), and (-1/INF) all represent zero, but dividing a positive number by each of them should in theory yield three different results (IEEE math regards the first two as identical). – supercat Dec 05 '12 at 14:22
0

"the difference between 1 and the smallest value greater than 1" means one plus the machine epsilon, which is around 10^-8 or 10^-16 depending on whether you use float or double variables, respectively. To see it, you can divide 1 by 2 until the computer sees 1 = 1 + 1/2^p, as below:

#include <iostream>
#include <cmath>
using namespace std;

int main() {
    float a = 1;
    int n = 0;
    // Halve a until adding it to 1 no longer changes the result.
    while (1 + a != 1) {
        a = a / 2;
        n += 1;
    }
    cout << n - 1 << endl << pow(2, -n) << endl;
    return 0;
}
-1

Also, a good reason for having such a function is to remove "denormals" (those very small numbers that can no longer use the implied leading "1" and have a special FP representation). Why would you want to do this? Because some machines (in particular, some older Pentium 4s) get really, really slow when processing denormals. Others just get somewhat slower. If your application doesn't really need these very small numbers, flushing them to zero is a good solution. Good places to consider this are the last steps of any IIR filters or decay functions.

See also: Why does changing 0.1f to 0 slow down performance by 10x?

and http://en.wikipedia.org/wiki/Denormal_number

Dithermaster
  • 2
    This removes many more numbers than just denormalised numbers. It changes Planck's constant or the mass of an electron to zero which will give you very, very wrong results if you used these numbers. – gnasher729 May 25 '16 at 08:43