504

It's well known that comparing floats for equality is a little fiddly due to rounding and precision issues.

For example: Comparing Floating Point Numbers, 2012 Edition

What is the recommended way to deal with this in Python?

Is there a standard library function for this somewhere?

Peter Mortensen
Gordon Wrigley
  • @tolomea: Since it depends on your application and your data and your problem domain -- and it's only one line of code -- why would there be a "standard library function"? – S.Lott Apr 08 '11 at 13:23
  • 14
    @S.Lott: `all`, `any`, `max`, `min` are each basically one-liners, and they aren't just provided in a library, they're builtin functions. So the BDFL's reasons aren't that. The one line of code that most people write is pretty unsophisticated and often doesn't work, which is a strong reason to provide something better. Of course any module providing other strategies would have to also provide caveats describing when they're appropriate, and more importantly when they aren't. Numeric analysis is hard, it's no great disgrace that language designers usually don't attempt tools to help with it. – Steve Jessop Apr 08 '11 at 17:49
  • @Steve Jessop. Those collection-oriented functions don't have the application, data and problem domain dependencies that float-point does. So the "one-liner" clearly isn't as important as the real reasons. Numeric analysis is hard, and can't be a first-class part of a general-purpose language library. – S.Lott Apr 08 '11 at 17:53
  • 7
    @S.Lott: I'd probably agree if the standard Python distribution didn't come with *multiple* modules for XML interfaces. Clearly the fact that different applications need to do something differently is no bar at all to putting modules in the base set to do it one way or another. Certainly there are tricks for comparing floats that get re-used a lot, the most basic being a specified number of ulps. So I only partially agree - the problem is that numeric analysis is hard. Python *could* in principle provide tools to make it somewhat easier, some of the time. I guess nobody has volunteered. – Steve Jessop Apr 08 '11 at 18:01
  • @Steve Jessop. It depends too much on your application and your data and your problem domain. Since it boils down to one hard-to-design line of code, the libraries don't add much. A book like *Numerical Recipes in Python* would add more than a library which trivializes hard problems. – S.Lott Apr 08 '11 at 18:11
  • @S.Lott: so I guess the question is whether the code in that book is actually worth using. If so, there's no particular reason it couldn't be a library, although granted people who haven't read the book might have difficulty using it correctly. Then again, people who don't read a book on Unicode can easily screw up using `str` correctly. Once you have a library of valuable tools, whether it's made core could be assessed under whatever the usual criteria are. The difference between, say text processing (very application- and data-specific) and numeric processing is how many people need it. – Steve Jessop Apr 08 '11 at 18:26
  • 4
    Also, "it boils down to one hard-to-design line of code" - if it's still a one-liner once you're doing it properly, I think your monitor is wider than mine ;-). Anyway, I think the whole area is quite specialized, in the sense that *most* programmers (including me) very rarely use it. Combined with being hard, it's not going to hit the top of the "most wanted" list for core libraries in most languages. – Steve Jessop Apr 08 '11 at 18:31
  • @S.Lott So far we really only seem to have two workable solutions, one of which is only valid if you know the magnitude of the inputs. Sure the epsilon value(s) need to be adjusted based on application, but they are arguments. – Gordon Wrigley Apr 09 '11 at 23:05
  • Never had such problems with Matlab, why? – bonobo Aug 29 '22 at 07:41
  • @bonobo Maybe you aren't trying hard enough https://stackoverflow.com/questions/23824577/what-are-the-best-practices-for-floating-point-comparisons-in-matlab – Gordon Wrigley Aug 30 '22 at 09:13

18 Answers

496

Python 3.5 adds the math.isclose and cmath.isclose functions as described in PEP 485.

If you're using an earlier version of Python, the equivalent function is given in the documentation.

def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

rel_tol is a relative tolerance: it is multiplied by the greater of the magnitudes of the two arguments, so as the values get larger, so does the allowed difference between them while still considering them equal.

abs_tol is an absolute tolerance that is applied as-is in all cases. If the difference is less than either of those tolerances, the values are considered equal.
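For example (a quick sketch, not part of the original answer; the exact True/False results assume standard IEEE 754 doubles):

import math

print(math.isclose(0.1 + 0.2, 0.3))             # True with the default rel_tol=1e-09
print(0.1 + 0.2 == 0.3)                         # False: plain equality is exact
print(math.isclose(1e-10, 0.0))                 # False: a relative tolerance alone never matches against zero
print(math.isclose(1e-10, 0.0, abs_tol=1e-09))  # True: abs_tol handles comparisons against zero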

Adil
Mark Ransom
  • 51
    note when `a` or `b` is a `numpy` `array`, [`numpy.isclose`](http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.isclose.html) works. – abcd Oct 29 '15 at 20:52
  • 6
    @marsh `rel_tol` is a *relative tolerance*, it is multiplied by the greater of the magnitudes of the two arguments; as the values get larger, so does the allowed difference between them while still considering them equal. `abs_tol` is an *absolute tolerance* that is applied as-is in all cases. If the difference is less than either of those tolerances, the values are considered equal. – Mark Ransom Nov 11 '15 at 23:16
  • 5
    Not to diminish the value of this answer (I think it's a good one), it's worth noting that the documentation also says: "Modulo error checking, etc, the function will return the result of..." In other words, the `isclose` function (above) is not a _complete_ implementation. – rkersh Jul 14 '16 at 19:50
  • 5
    Apologies for reviving an old thread, but it seemed worth pointing out that `isclose` always adheres to the _less_ conservative criterion. I only mention it because that behavior is counterintuitive to me. Were I to specify two criteria, I would always expect the smaller tolerance to supercede the greater. – Mackie Messer Mar 21 '17 at 13:57
  • 6
    @MackieMesser you're entitled to your opinion of course, but this behavior made perfect sense to me. By your definition nothing could ever be "close to" zero, because a relative tolerance multiplied by zero is always zero. – Mark Ransom Mar 21 '17 at 15:32
  • 1
    Ah! That's an excellent point that totally slipped past me. It does make sense from that perspective. In fact this is exactly how I would expect tolerances to be interpreted were both absolute and relative specified for optimization etc. Great point. I will leave it up for the benefit of others to have the same realization. – Mackie Messer Mar 21 '17 at 16:35
  • 1
    This function will not work when, say, a=0.0000000000000001 and b=0.0; it will return False. So when b=0.0 there seems to be an issue with this function. – PapaDiHatti Aug 29 '17 at 08:07
  • @Kapil you can specify an `abs_tol` to get around that. That's why there are parameters, the makers of this function knew they couldn't provide defaults that work in every situation. – Mark Ransom Aug 29 '17 at 12:43
  • @MackieMesser Using the smaller of both values as a criterion of rejection would suggest the interpretation of an "intolerance". But it is indeed somewhat counterintuitive at first. – NoBackingDown Dec 01 '17 at 06:43
  • I don't know how Python 3.5 and PEP 485 handle this, but this doesn't handle the case when either argument is `float("inf")` very well. It declares that "2" isclose to "float('inf')"! I added a case to handle infinite values in my implementation... – Dan H Apr 16 '18 at 22:08
  • @DanH there's a comment up above that states this function from the documentation is not a *complete* implementation. I imagine that `float('inf')` is the kind of thing they had in mind. – Mark Ransom Apr 16 '18 at 23:38
  • @dbliss [`numpy.isclose`](http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.isclose.html) works for regular arrays too – Guy Oct 22 '18 at 11:47
  • In your equivalent function, you use `<=`, which means less than or equal. However, `max(rel_tol * max(abs(a), abs(b)), abs_tol)` is a float number, and `abs(a-b)` is also a float number, so how does the `equal` part ever apply? – Kingname Dec 07 '19 at 12:29
  • @Kingname equals works fine on floats if they're actually equal. That happens sometimes. – Mark Ransom Dec 07 '19 at 13:01
  • @Kapil you might want to do this `defaultAbsTol = a is 0 or b is 0 ? relTol : 0` – Rivenfall Feb 26 '20 at 10:39
  • @Rivenfall why would that be better than just picking an appropriate `abs_tol` based on your expected range of data? Also it might be confusing to be using C syntax for a Python question. – Mark Ransom Feb 26 '20 at 16:38
  • @MarkRansom the interest of the math.isclose function is rel_tol. It's quite easy to compare floats with an abs_tol. But my comment was about Kapil's warning that 0 and 1e-15 are still considered different when doing relative comparison. – Rivenfall Feb 26 '20 at 18:03
  • "less than *either*" means "less than *both*"? – bers Apr 24 '21 at 17:53
  • 1
    @bers no. "Either" means `or`, "both" means `and`. Only one of the conditions needs to be true for the values to be considered "close". – Mark Ransom Apr 24 '21 at 19:37
  • Ah, it takes another `max` right there. Thanks, understood! – bers Apr 25 '21 at 08:43
107

Something as simple as the following may be good enough:

return abs(f1 - f2) <= allowed_error
Peter Mortensen
Andrew White
  • 13
    As the link I provided points out, subtracting only works if you know the approximate magnitude of the numbers in advance. – Gordon Wrigley Apr 08 '11 at 13:14
  • 1
    This is way after the fact, but here is an alternative expression of @AndrewWhite's answer: return round(f1 - f2, some_precision_defaulting_to_seven_decimals) == 0. – A. Wilson Aug 19 '13 at 17:19
  • 16
    In my experience, the best method for comparing floats is: `abs(f1-f2) < tol*max(abs(f1),abs(f2))`. This sort of relative tolerance is the only meaningful way to compare floats in general, as they are usually affected by roundoff error in the small decimal places. – Sesquipedal Feb 04 '15 at 01:52
  • 7
    Just adding a simple example why it may not work: `>>> abs(0.04 - 0.03) <= 0.01`, it yields `False`. I use `Python 2.7.10 [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin` – schatten Sep 21 '15 at 00:28
  • 3
    @schatten to be fair, that example has more to do with machine binary precision/formats than the particular comparison algo. When you entered 0.03 into the system, that's not really the number that made it to the CPU. – Andrew White Sep 21 '15 at 01:31
  • 6
    @AndrewWhite that example shows that `abs(f1 - f2) <= allowed_error` does not work as expected. – schatten Sep 21 '15 at 01:35
  • ALMOST WRONG, the allowed_error depends on the size of f1 and f2. A proper answer needs to at least account for that. `abs(a-b) <= eps*max(abs(a),abs(b))` works as a shortcut, but more appropriate answers are below. – Davoud Taghawi-Nejad Nov 16 '15 at 15:43
72

I would agree that Gareth's answer is probably most appropriate as a lightweight function/solution.

But I thought it would be helpful to note that if you are using NumPy or are considering it, there is a packaged function for this.

numpy.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)

A little disclaimer though: installing NumPy can be a non-trivial experience depending on your platform.
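A quick elementwise sketch, assuming NumPy is installed (the array values are just illustrative):

import numpy as np

a = np.array([0.1 + 0.2, 1.0, 1e-9])
b = np.array([0.3, 1.0 + 1e-12, 0.0])
print(np.isclose(a, b))        # [ True  True  True] with the default rtol/atol
print(np.isclose(a, b).all())  # True only if every pair is close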

nbro
J.Makela
  • 3
    "Installing numpy can be a non-trivial experience depending on your platform."...um What? Which platforms is it "non-trivial" to install numpy? What exactly made it non-trivial? – John Nov 14 '14 at 18:36
  • 10
    @John: hard to get a 64-bit binary for Windows. Hard to get numpy via `pip` on Windows. – Ben Bolker Mar 06 '15 at 02:05
  • @Ternak: I do, but some of my students use Windows, so I have to deal with this stuff. – Ben Bolker Nov 24 '15 at 03:34
  • 5
    @BenBolker If you have to install open data science platform powered by Python, the best way is Anaconda https://www.continuum.io/downloads (pandas, numpy and more out of the box) – jrovegno Dec 27 '16 at 20:22
  • 1
    Use `numpy.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False).all()` to get a single True/False value if the two arrays are equal. – Contango Jun 21 '20 at 18:28
  • 1
    **It is extremely non-trivial to install NumPy on macOS with `pip`.** This is thanks to [Apple's blatantly broken multithreaded implementation of their *Accelerate* BLAS replacement, which neither NumPy or `pip` have control over](https://github.com/numpy/numpy/issues/15947). The only solution is to **(A)** force upgrade `pip`, `setuptools`, and `wheel` and **(B)** force reinstallation of NumPy with `--force-reinstall`. This recently brought [beartype's entire CI pipeline to its knees](https://github.com/beartype/beartype/runs/3257420924) – among others. Thanks, Apple! o_O – Cecil Curry Aug 06 '21 at 05:46
19

Use Python's decimal module, which provides the Decimal class.

From the comments:

It is worth noting that if you're doing math-heavy work and you don't absolutely need the precision from decimal, this can really bog things down. Floats are way, way faster to deal with, but imprecise. Decimals are extremely precise but slow.
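A minimal sketch (the values are just illustrative; note that Decimal should be constructed from strings rather than floats, or it inherits the float's rounding error):

from decimal import Decimal

a = Decimal('0.1') + Decimal('0.2')
b = Decimal('0.3')
print(a == b)            # True: this decimal arithmetic is exact
print(0.1 + 0.2 == 0.3)  # False with binary floats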

jathanism
15

math.isclose() has been added to Python 3.5 for that (source code). Here is a port of it to Python 2. Its difference from Mark Ransom's one-liner is that it can handle "inf" and "-inf" properly.

import math

def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    '''
    Python 2 implementation of Python 3.5 math.isclose()
    https://github.com/python/cpython/blob/v3.5.10/Modules/mathmodule.c#L1993
    '''
    # sanity check on the inputs
    if rel_tol < 0 or abs_tol < 0:
        raise ValueError("tolerances must be non-negative")

    # short circuit exact equality -- needed to catch two infinities of
    # the same sign. And perhaps speeds things up a bit sometimes.
    if a == b:
        return True

    # This catches the case of two infinities of opposite sign, or
    # one infinity and one finite number. Two infinities of opposite
    # sign would otherwise have an infinite relative tolerance.
    # Two infinities of the same sign are caught by the equality check
    # above.
    if math.isinf(a) or math.isinf(b):
        return False

    # now do the regular computation
    # this is essentially the "weak" test from the Boost library
    diff = math.fabs(b - a)
    result = (((diff <= math.fabs(rel_tol * b)) or
               (diff <= math.fabs(rel_tol * a))) or
              (diff <= abs_tol))
    return result
user2745509
15

The common wisdom that floating-point numbers cannot be compared for equality is inaccurate. Floating-point numbers are no different from integers: If you evaluate "a == b", you will get true if they are identical numbers and false otherwise (with the understanding that two NaNs are of course not identical numbers).

The actual problem is this: If I have done some calculations and am not sure the two numbers I have to compare are exactly correct, then what? This problem is the same for floating-point as it is for integers. If you evaluate the integer expression "7/3*3", it will not compare equal to "7*3/3".

So suppose we asked "How do I compare integers for equality?" in such a situation. There is no single answer; what you should do depends on the specific situation, notably what sort of errors you have and what you want to achieve.

Here are some possible choices.

If you want to get a "true" result if the mathematically exact numbers would be equal, then you might try to use the properties of the calculations you perform to prove that you get the same errors in the two numbers. If that is feasible, and you compare two numbers that result from expressions that would give equal numbers if computed exactly, then you will get "true" from the comparison. Another approach is that you might analyze the properties of the calculations and prove that the error never exceeds a certain amount, perhaps an absolute amount or an amount relative to one of the inputs or one of the outputs. In that case, you can ask whether the two calculated numbers differ by at most that amount, and return "true" if they are within the interval. If you cannot prove an error bound, you might guess and hope for the best. One way of guessing is to evaluate many random samples and see what sort of distribution you get in the results.

Of course, since we only set the requirement that you get "true" if the mathematically exact results are equal, we left open the possibility that you get "true" even if they are unequal. (In fact, we can satisfy the requirement by always returning "true". This makes the calculation simple but is generally undesirable, so I will discuss improving the situation below.)

If you want to get a "false" result if the mathematically exact numbers would be unequal, you need to prove that your evaluation of the numbers yields different numbers if the mathematically exact numbers would be unequal. This may be impossible for practical purposes in many common situations. So let us consider an alternative.

A useful requirement might be that we get a "false" result if the mathematically exact numbers differ by more than a certain amount. For example, perhaps we are going to calculate where a ball thrown in a computer game traveled, and we want to know whether it struck a bat. In this case, we certainly want to get "true" if the ball strikes the bat, and we want to get "false" if the ball is far from the bat, and we can accept an incorrect "true" answer if the ball in a mathematically exact simulation missed the bat but is within a millimeter of hitting the bat. In that case, we need to prove (or guess/estimate) that our calculation of the ball's position and the bat's position have a combined error of at most one millimeter (for all positions of interest). This would allow us to always return "false" if the ball and bat are more than a millimeter apart, to return "true" if they touch, and to return "true" if they are close enough to be acceptable.

So, how you decide what to return when comparing floating-point numbers depends very much on your specific situation.

As to how you go about proving error bounds for calculations, that can be a complicated subject. Any floating-point implementation using the IEEE 754 standard in round-to-nearest mode returns the floating-point number nearest to the exact result for any basic operation (notably multiplication, division, addition, subtraction, square root). (In case of tie, round so the low bit is even.) (Be particularly careful about square root and division; your language implementation might use methods that do not conform to IEEE 754 for those.) Because of this requirement, we know the error in a single result is at most 1/2 of the value of the least significant bit. (If it were more, the rounding would have gone to a different number that is within 1/2 the value.)

Going on from there gets substantially more complicated; the next step is performing an operation where one of the inputs already has some error. For simple expressions, these errors can be followed through the calculations to reach a bound on the final error. In practice, this is only done in a few situations, such as working on a high-quality mathematics library. And, of course, you need precise control over exactly which operations are performed. High-level languages often give the compiler a lot of slack, so you might not know in which order operations are performed.

There is much more that could be (and is) written about this topic, but I have to stop there. In summary, the answer is: There is no library routine for this comparison because there is no single solution that fits most needs that is worth putting into a library routine. (If comparing with a relative or absolute error interval suffices for you, you can do it simply without a library routine.)
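As a small illustration of that half-ULP bound (a sketch added here; math.ulp requires Python 3.9+, which is newer than this answer):

import math

x = 0.1 + 0.2              # one correctly rounded addition (plus the rounded literals)
print(x == 0.3)            # False
print(abs(x - 0.3))        # about 5.5e-17
print(math.ulp(0.3) / 2)   # about 2.8e-17, the half-ulp bound on a single rounding

The observed difference here is on the order of one ulp of 0.3, which is the scale of error this kind of analysis deals in.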

Eric Postpischil
  • 4
    From discussion above with Gareth McCaughan, correctly comparing with a relative error essentially amounts to "abs(a-b) <= eps*max(2**-1022,abs(a),abs(b))", that's not something I'd describe as simple and certainly not something I'd have worked out by myself. Also as Steve Jessop points out it is of similar complexity to max, min, any and all, which are all builtins. So providing a relative error comparison in the standard math module seems like a good idea. – Gordon Wrigley Apr 12 '11 at 03:45
  • (7/3*3 == 7*3/3) evaluates True in python. – xApple Aug 30 '13 at 13:53
  • 2
    @xApple: I just ran Python 2.7.2 on OS X 10.8.3 and entered `(7/3*3 == 7*3/3)`. It printed `False`. – Eric Postpischil Aug 30 '13 at 13:57
  • 3
    You probably forgot to type `from __future__ import division`. If you don't do that, there are no floating point numbers and the comparison is between two integers. – xApple Aug 31 '13 at 14:18
  • "The common wisdom that floating-point numbers cannot be compared for equality is inaccurate." Agreed. People do not appreciate that two floats may in fact have *exactly* the same binary representation, and if this is what you want to check then go ahead. As a somewhat contrived example, if the float `b` is copied from `a` at some point and it may or may not be changed by some operation and you'd like to check if it has been modified or not, `a==b` is a perfectly fine test. The original assignment would copy `a` into `b` bit-by-bit the same way as for integers. – sigvaldm May 30 '19 at 10:10
  • @xApple: The sentence in which `7/3*3 == 7*3/3` arises is “If you evaluate the integer expression "7/3*3", it will not compare equal to "7*3/3".” Evaluating the expression with floating-point arithmetic has no relevance to the point being made. – Eric Postpischil May 30 '19 at 11:38
  • 1
    I think it gets funnier if you do `7*3/2.5 == 7/2.5*3` which is the point of the question. – seb Jul 06 '21 at 13:08
  • It seems you're focusing only on calculation errors and are missing representation errors. These do exist in floating point numbers but not in integers, and that creates a number of problems that are unique to floating point numbers. `1` is represented in the data exactly as what is in the text, whereas `0.1` is not. That leads to `1 + 2 == 3` being `True`, whereas `0.1 + 0.2 == 0.3` being `False`. This is not due to a calculation error but rather due to a representation error -- a class of errors we don't have with integers. – Gerhard Aug 21 '23 at 19:34
  • @Gerhard: The floating-point number 0.1 (which is representable only in a decimal format, or a format whose base is a multiple of 10) represents the real number 0.1 exactly and does not represent any other real number or any interval. This is per IEEE 754-2019 clause 3. (Similarly, the binary32 number 0.100000001490116119384765625 represents the real number 0.100000001490116119384765625 and does not represent 0.1.) This is the model used throughout the IEEE 754 standard and texts on floating-point such as *Handbook of Floating-Point Arithmetic* by Muller *et al*… – Eric Postpischil Aug 21 '23 at 19:42
  • … It is a mistake to treat floating-point numbers as if they represent some numbers or intervals other than as specified, as this renders proofs and error analysis incorrect. E.g., the rounding error that results from an operation is a function of the specified values represented by the operands and not by any other numbers or intervals. When a decimal numeral is converted to a binary floating-point format, this is an operation like any other and has rounding errors like any other, and that is how analysis of such operations is performed. – Eric Postpischil Aug 21 '23 at 19:44
  • @Gerhard: The notion that integers do not have “representation error” is laughable. Converting 2.3 to an integer format yields a much larger error than converting it to a floating-point format. Integer arithmetic is a subset of floating-point arithmetic. It is merely used for purposes within its limitations, with programmers learning how to deal with its behaviors that differ from real-number arithmetic, such as truncating in `11/4`. Floating-point should be approached in the same way: Learn how to deal with its behaviors. – Eric Postpischil Aug 21 '23 at 19:45
  • @EricPostpischil Most of what you wrote either confirms what I wrote or has nothing to do with what I wrote. A few notes: (1) "Laughable" is not a factual categorization and doesn't seem adequate here (or in any civil conversation). ... – Gerhard Aug 24 '23 at 17:44
  • @EricPostpischil ... (2) "Converting 2.3 to an integer format" is of course converting a non-integer (2.3 is not an integer) to an integer format. I was talking about the specific problems (that you described) when converting decimal non-integer real numbers like 2.3 into the binary representation used for IEEE floating points _that do not occur_ when converting decimal integers like 23 into the binary representation used for integers by Python. (3) "Integer arithmetic is a subset of floating-point arithmetic." While this may be true in some environments, it is not true in Python. – Gerhard Aug 24 '23 at 17:52
  • @Gerhard: Converting decimal integers to binary integers is like converting values representable in a floating-point format to a floating-point format. You are treating integer formats as “different” from floating-point formats by excluding the values they do not represent. In other words, you have adopted your think for integer formats to only consider the values they are specified to work for and neglecting the problems that occur if we want to use them for other cases. Similarly, modeling floating-point numbers as approximations of real numbers is the wrong model… – Eric Postpischil Aug 24 '23 at 18:10
  • … You should adapt your thinking to use the model they were intended for: Floating-point numbers represent a set of numbers exactly. Approximations occur in the operations, not in the numbers, the same way integer division truncates and similarly to the way integer addition overflows. When floating-point arithmetic is used the way it was designed to be used, or even just thought of the way it was intended to be used, it is easier to work with than trying to force it into a model it was not designed for. – Eric Postpischil Aug 24 '23 at 18:12
  • @EricPostpischil I think you completely miss my point. There isn't anything I could write that I haven't. As far as operations go, we are in agreement. You just don't see that there is something besides the operations that often causes problems (not for me, but for people new to this). If it doesn't for you, good for you. My model works for me. Since you don't quite understand my model, I'm not sure you're qualified to say another model is better. Also, working in Python under the assumption that integer math is a subset of float math is ... problematic at best. – Gerhard Aug 25 '23 at 21:45
  • @Gerhard: What sentence(s) in this answer do you want to be different? – Eric Postpischil Aug 25 '23 at 22:28
  • @EricPostpischil It's not about individual sentences. It's about one issue that's completely missing: the fact that `1 + 2 == 3` evaluates to `True` but `0.1 + 0.2 == 0.3` evaluates to `False` is not due to anything you wrote about. It's due to the fact that the machine numbers match their lexical representation for the integers but not for the floats. You describe what happens once the numbers are in machine representation, but omit the issues stemming from the difference between lexical and machine representation. That's where we have a categorical difference between integers and floats. – Gerhard Aug 27 '23 at 15:31
  • @Gerhard: “the fact that 1 + 2 == 3 evaluates to True but 0.1 + 0.2 == 0.3 evaluates to False is not due to anything you wrote about”: Yes, it is. As I wrote above, “When a decimal numeral is converted to a binary floating-point format, this is an operation like any other and has rounding errors like any other, and that is how analysis of such operations is performed.” Compilation of `0.1 + 0.2 == 0.3` performs conversion of decimal numerals to floating-point numbers. That is an operation. – Eric Postpischil Aug 27 '23 at 15:46
  • @EricPostpischil "That is an operation." This is correct. "this is an operation like any other ..." This is arguably not fully correct. The fact that it occurs between the lexical representation and the binary representation makes it different from problems that occur when performing operations between binary representations. This is probably where our differences stem from. There is a difference. You seem to think it is not relevant, but it is a difference. – Gerhard Aug 28 '23 at 16:59
  • @Gerhard: No, it is not different. The rounding is specified the same in IEEE-754—the nearest representable result in the direction selected by the rounding method being used is produced—and it is analyzed mathematically the same. It introduces a potential error like any other operation, and there is no reason to treat it any differently. How do you want to treat it, and how does the math differ? And how would that be better? – Eric Postpischil Aug 28 '23 at 17:57
  • @EricPostpischil It happens in a different place. Just because it is specified in the same spec doesn't mean that it's the same thing. For me, that's enough to treat it differently in an explanation. In your answer, you talk about "calculations" that are the cause of the problem. In our conversation, you changed to "operations", because writing "0.1" in Python source is not really a calculation, but it causes a (hidden, despite being specified) operation that changes the value of this number. This makes it different from calculations. Agree to disagree? – Gerhard Aug 29 '23 at 18:47
14

I'm not aware of anything in the Python standard library (or elsewhere) that implements Dawson's AlmostEqual2sComplement function. If that's the sort of behaviour you want, you'll have to implement it yourself. (In which case, rather than using Dawson's clever bitwise hacks, you'd probably do better to use more conventional tests of the form if abs(a-b) <= eps1*(abs(a)+abs(b)) + eps2 or similar. To get Dawson-like behaviour you might say something like if abs(a-b) <= eps*max(EPS,abs(a),abs(b)) for some small fixed EPS; this isn't exactly the same as Dawson, but it's similar in spirit.)
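A minimal sketch of those two tests (the default tolerances and the choice of EPS are assumptions, not values from this answer; sys.float_info.min is the smallest normal double, 2**-1022, as discussed in the comments below):

import sys

def nearly_equal(a, b, eps1=1e-9, eps2=1e-12):
    # relative tolerance eps1 plus absolute tolerance eps2
    return abs(a - b) <= eps1 * (abs(a) + abs(b)) + eps2

def dawson_like(a, b, eps=1e-9, EPS=sys.float_info.min):
    # relative tolerance, treating anything of size EPS or smaller as having size EPS
    return abs(a - b) <= eps * max(EPS, abs(a), abs(b))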

Gareth McCaughan
  • I don't quite follow what you are doing here, but it is interesting. What is the difference between eps, eps1, eps2 and EPS? – Gordon Wrigley Apr 08 '11 at 13:19
  • `eps1` and `eps2` define a relative and an absolute tolerance: you're prepared to allow `a` and `b` to differ by about `eps1` times how big they are plus `eps2`. `eps` is a single tolerance; you're prepared to allow `a` and `b` to differ by about `eps` times how big they are, with the proviso that anything of size `EPS` or smaller is assumed to be of size `EPS`. If you take `EPS` to be the smallest non-denormal value of your floating-point type, this is very similar to Dawson's comparator (except for a factor of 2^#bits because Dawson measures tolerance in ulps). – Gareth McCaughan Apr 08 '11 at 14:22
  • 2
    Incidentally, I agree with S. Lott that the Right Thing is always going to depend on your actual application, which is why there isn't a single standard library function for all your floating-point comparison needs. – Gareth McCaughan Apr 08 '11 at 14:23
  • @gareth-mccaughan How does one determine the "smallest non-denormal value of your floating-point type" for python? – Gordon Wrigley Apr 09 '11 at 23:09
  • This page http://docs.python.org/tutorial/floatingpoint.html says almost all python implementations use IEEE-754 double precision floats and this page http://en.wikipedia.org/wiki/IEEE_754-1985 says the normalized numbers closest to zero are ±2**−1022. – Gordon Wrigley Apr 09 '11 at 23:34
  • Yup. A few notes. (1) I goofed a few comments earlier: for Dawsonish results you want to take EPS not to be the smallest non-denormal value but something like 2^52 times that. (2) You probably don't *really* want to be too Dawsonish when using doubles, since the corresponding EPS is so very very small. (3) If you're really using an EPS of Dawsonian size, then you might as well just omit it and use `abs(a-b) <= eps*max(abs(a),abs(b))` or something of the sort. – Gareth McCaughan Apr 10 '11 at 09:18
  • @gareth-mccaughan (1) Where does the 2^52 come from? (3) My understanding of the denormal numbers (mostly via http://en.wikipedia.org/wiki/Denormal_number) implies that the second smallest denormal number will be twice as big as the smallest, is that right? in that situation the test will fail for any value of eps less than 0.5! I thought the EPS term was required to prevent this? – Gordon Wrigley Apr 11 '11 at 02:55
  • (1) From the difference between counting ulps (as Dawson does) and comparing with the value of the numbers themselves. Any non-denormal `double` is approximately 2^52 ulps in size. (2) What happened to 2? :-) (3) Sorry, yes, you do need a nonzero `EPS` if you want denormals treated in a sensible way. – Gareth McCaughan Apr 11 '11 at 11:50
  • So what we've essentially done after much discussion is reproduce Dawson's AlmostEqualRelativeOrAbsolute, with the absolute error currently set to the smallest normal number. That makes sense to me, when working with unknown magnitudes the relative error is appropriate and the absolute bit is quashing any problems with numbers very close to zero by treating them all as equal. Also I find the relative error is more intuitive than ulps, making it easier to pick an appropriate value at the call site. You should update your answer to incorporate some of the discussion. – Gordon Wrigley Apr 11 '11 at 22:57
11

If you want to use it in testing/TDD context, I'd say this is a standard way:

from nose.tools import assert_almost_equals

assert_almost_equals(x, y, places=7) # The default is 7
Peter Mortensen
volodymyr
8

In terms of absolute error, you can just check

if abs(a - b) <= error:
    print("Almost equal")

Some information on why floats act weird in Python: Python 3 Tutorial 03 - if-else, logical operators and top beginner mistakes

You can also use math.isclose for relative errors.

Peter Mortensen
Rahul Sharma
5

This is useful for the case where you want to make sure two numbers are the same 'up to precision', and there isn't any need to specify the tolerance:

  • Find minimum precision of the two numbers

  • Round both of them to minimum precision and compare

def isclose(a, b):
    astr = str(a)
    aprec = len(astr.split('.')[1]) if '.' in astr else 0
    bstr = str(b)
    bprec = len(bstr.split('.')[1]) if '.' in bstr else 0
    prec = min(aprec, bprec)
    return round(a, prec) == round(b, prec)

As written, it only works for numbers without the 'e' in their string representation (meaning 0.9999999999995e-4 < number <= 0.9999999999995e11)

Example:

>>> isclose(10.0, 10.049)
True
>>> isclose(10.0, 10.05)
False
Peter Mortensen
CptHwK
  • 2
    The unbounded concept of close will not serve you well. `isclose(1.0, 1.1)` produces `False`, and `isclose(0.1, 0.000000000001)` returns `True`. – kfsone Oct 14 '19 at 21:17
2

For some of the cases where you can affect the source number representation, you can represent them as fractions instead of floats, using integer numerator and denominator. That way you can have exact comparisons.

See Fraction from the fractions module for details.
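A minimal sketch (the values are just illustrative):

from fractions import Fraction

a = Fraction(1, 10) + Fraction(2, 10)
b = Fraction(3, 10)
print(a == b)            # True: exact rational arithmetic, no rounding
print(0.1 + 0.2 == 0.3)  # False with binary floats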

eis
2

I liked Sesquipedal's suggestion, but with a modification (without it, the special case where both values are 0 returns False). In my case, I was on Python 2.7 and just used a simple function:

def isclose(f1, f2, tol):
    # both exactly zero is treated as equal; otherwise use a relative tolerance
    if f1 == 0 and f2 == 0:
        return True
    else:
        return abs(f1 - f2) < tol * max(abs(f1), abs(f2))
Peter Mortensen
IronYeti
2

If you want to do it in a testing or TDD context using the pytest package, here's how:

import pytest


PRECISION = 1e-3

def assert_almost_equal():
    obtained_value = 99.99
    expected_value = 100.00
    assert obtained_value == pytest.approx(expected_value, PRECISION)
Ethan Chen
0

I found the following comparison helpful:

str(f1) == str(f2)
nbro
Kresimir
  • it's interesting, but not very practical due to str(.1 + .2) == .3 – Gordon Wrigley May 26 '15 at 21:06
  • str(.1 + .2) == str(.3) returns True – Henrikh Kantuni Mar 12 '16 at 22:38
  • 1
    How is this any different from f1 == f2 -- if they're both close but still different due to precision, the string representations will also be unequal. – MrMas Jan 11 '17 at 16:43
  • 2
    .1 + .2 == .3 returns False while str(.1 + .2) == str(.3) returns True – Kresimir Jan 12 '17 at 16:35
  • 6
    In Python 3.7.2, `str(.1 + .2) == str(.3)` returns False. The method described above works only for Python 2. – Danibix Mar 23 '19 at 16:29
  • But you wouldn't write "str(.1 + .2) == str(.3). You'd be doing something like this: `a, b = .1, .2; s = a + b ; x = .3; str(s) == str(x)` which returns False. It also doesn't test "near equality" like the OP asked. `str(.100000000001) == str(.100000000002)` == False. – kfsone Oct 14 '19 at 21:11
0

This may be a bit of an ugly hack, but it works pretty well when you don't need more than the default float precision (about 11 decimals).

The round_to function uses the format method from the built-in str class to round the float to a string with the needed number of decimals (the format expression is built as a string and run through the eval built-in).

The is_close function then just compares the two rounded strings for equality.

def round_to(float_num, prec):
    return eval("'{:." + str(int(prec)) + "f}'.format(" + str(float_num) + ")")

def is_close(float_a, float_b, prec):
    if round_to(float_a, prec) == round_to(float_b, prec):
        return True
    return False

>>> a = 10.0
>>> b = 10.0001
>>> print is_close(a, b, prec=3)
True
>>> print is_close(a, b, prec=4)
False

Update:

As suggested by @stephenjfox, a cleaner way to build a round_to function avoiding "eval" is using nested formatting:

def round_to(float_num, prec):
    return '{:.{precision}f}'.format(float_num, precision=prec)

Following the same idea, the code can be even simpler using the great new f-strings (Python 3.6+):

def round_to(float_num, prec):
    return f'{float_num:.{prec}f}'

So, we could even wrap it up all in one simple and clean 'is_close' function:

def is_close(a, b, prec):
    return f'{a:.{prec}f}' == f'{b:.{prec}f}'
  • 1
    You don't have to use `eval()` to get parameterized formatting. Something like `return '{:.{precision}f'.format(float_num, precision=decimal_precision)` should do it – stephenjfox Jul 12 '19 at 17:43
  • 1
    Source for my comment and more examples: https://pyformat.info/#param_align – stephenjfox Jul 12 '19 at 17:46
  • 1
    Thanks @stephenjfox, I didn't know about nested formatting. Btw, your sample code lacks the ending curly braces: `return '{:.{precision}}f'.format(float_num, precision=decimal_precision)` – Albert Alomar Dec 08 '19 at 20:35
  • 1
    Good catch, and especially well done enhancement with the f-strings. With the death of Python 2 around the corner, maybe this will become the norm – stephenjfox Dec 09 '19 at 17:00
0

To compare up to a given decimal without atol/rtol:

def almost_equal(a, b, decimal=6):
    return '{0:.{1}f}'.format(a, decimal) == '{0:.{1}f}'.format(b, decimal)

print(almost_equal(0.0, 0.0001, decimal=5)) # False
print(almost_equal(0.0, 0.0001, decimal=4)) # True 
Vlad
0

If you want to compare floats, the options above are great, but in my case I ended up using an Enum, since my use case only accepted a few valid floats.

from enum import Enum
class HolidayMultipliers(Enum):
    EMPLOYED_LESS_THAN_YEAR = 2.0
    EMPLOYED_MORE_THAN_YEAR = 2.5

Then running:

testable_value = 2.0
HolidayMultipliers(testable_value)

If the float is valid, it's fine; otherwise it will just throw a ValueError.
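A minimal usage sketch of guarding the lookup (the invalid value 2.1 is just illustrative):

try:
    HolidayMultipliers(2.1)
except ValueError:
    print("Not an accepted multiplier")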

Peter Mortensen
Ilmari Kumpula
-3

Using == is a simple, good way if you don't need precise control over the tolerance.

# Python 3.8.5
>>> 1.0000000000001 == 1
False
>>> 1.0000000000000001 == 1
True

But watch out for 0:

>>> 0 == 0.00000000000000000000000000000000000000000001
False

Zero only ever compares equal to exactly zero.


Use math.isclose if you want to control the tolerance.

The default a == b is equivalent to math.isclose(a, b, rel_tol=1e-16, abs_tol=0).


If you still want to use == with a self-defined tolerance:

>>> import math
>>> class MyFloat(float):
...     def __eq__(self, another):
...         return math.isclose(self, another, rel_tol=0, abs_tol=0.001)
...
>>> a = MyFloat(0)
>>> a
0.0
>>> a == 0.001
True

So far, I haven't found a way to configure this globally for float. Besides, mock doesn't work for float.__eq__ either.

Yan QiDong
  • 1
    You can't configure it globally cause it's not applying a tolerance it's comparing the actual bit values. While C Python uses C doubles this is not required in the spec, it may change in the future and other Python variants may do other things. So comparing floats with == may do different things depending on environment. – Gordon Wrigley May 11 '21 at 08:46
  • Yes, I was wrong. `1 + 1e-16 == 1` in Python, just because `1 + 1e-16` is `1.0` after precision lost. – Yan QiDong May 11 '21 at 09:21