17

I ran into unexpected results in a Python if clause today:

import numpy
if numpy.allclose(6.0, 6.1, rtol=0, atol=0.5):
    print 'close enough'  # works as expected (prints message)

if numpy.allclose(6.0, 6.1, rtol=0, atol=0.5) is True:
    print 'close enough'  # does NOT work as expected (prints nothing)

After some poking around (i.e., this question, and in particular this answer), I understand the cause: the type returned by numpy.allclose() is numpy.bool_ rather than plain old bool, and if foo = numpy.bool_(1), then if foo: takes the branch while if foo is True: does not. The difference comes from the is operator, which tests object identity rather than truthiness or equality.
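The behaviour described above can be reproduced with a minimal sketch. It assumes NumPy is installed and builds a numpy.bool_ directly instead of relying on what numpy.allclose returns, since that may differ between NumPy versions. The variable names are illustrative only.

```python
import numpy as np

foo = np.bool_(1)

# What `if foo:` tests: the truth value of the object.
truthy = bool(foo)

# What `if foo is True:` tests: whether foo is the very same object
# as the built-in True singleton.
identical = foo is True

# numpy.bool_ is a separate type, not the built-in bool.
is_builtin_bool = isinstance(foo, bool)

print(truthy, identical, is_builtin_bool)
```

The truthiness test succeeds while the identity test fails, which is the mismatch seen in the two if clauses.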

My questions are: why does numpy have its own boolean type, and what is best practice in light of this situation? I can get away with writing if foo: to get expected behavior in the example above, but I like the more stringent if foo is True: because it excludes things like 2 and [2] from returning True, and sometimes the explicit type check is desirable.

drammock
  • 1
    You're using Python 2, where True isn't a keyword, so it's possible that True has been redefined. As a result, `is True` doesn't necessarily prove anything typewise, because `True` might be `("not quite True", 7)`. `if foo is (True is True):` or something should work, though.. (ducks and runs) – DSM Sep 20 '13 at 17:31
  • @DSM: That is true (but `is` it `True`?), but I don't think it's the most serious problem with what he's doing. Also, while your trick would fix the problem for any function that returned the original `True` (from a different module's globals, from a closure or local copy, from a C extension using `Py_True`, etc.), it would _break_ any function that returned the current global `True`. And to debug that, I think you need to read up on Kripke. :) – abarnert Sep 20 '13 at 17:51

2 Answers

23

You're doing something which is considered an anti-pattern. Quoting PEP 8:

Don't compare boolean values to True or False using ==.

Yes:   if greeting:
No:    if greeting == True:
Worse: if greeting is True:

The fact that numpy wasn't designed to facilitate your non-pythonic code isn't a bug in numpy. In fact, it's a perfect example of why your personal idiom is an anti-pattern.


As PEP 8 says, using is True is even worse than == True. Why? Because you're checking object identity: not only must the result be truthy in a boolean context (which is usually all you need), and equal to the boolean True value, it has to actually be the constant True. It's hard to imagine any situation in which this is what you want.

And you specifically don't want it here:

>>> np.True_ == True
True
>>> np.True_ is True
False

So, all you're doing is explicitly making your code incompatible with numpy, and various other C extension libraries (conceivably a pure-Python library could return a custom value that's equal to True, but I don't know of any that do so).
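For the rare case where an explicit type check really is wanted, one possible sketch is shown here. The helper name is_strictly_true is hypothetical and not part of NumPy or the standard library; it accepts both the built-in bool and numpy.bool_ and rejects truthy non-booleans such as 2 and [2].

```python
import numpy as np


def is_strictly_true(value):
    """Hypothetical helper: True only for a boolean-typed value that is true."""
    # Accept either boolean type, then check the truth value itself.
    return isinstance(value, (bool, np.bool_)) and bool(value)


print(is_strictly_true(np.True_))  # boolean-typed and true
print(is_strictly_true(2))         # truthy, but not a boolean type
print(is_strictly_true([2]))       # truthy, but not a boolean type
```

This keeps the type check explicit without depending on object identity, so it does not break when a library returns its own boolean type.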


In your particular case, there is no reason to exclude 2 and [2]. If you read the docs for numpy.allclose, it clearly isn't going to return them. But consider some other function, like many of those in the standard library that just say they evaluate to true or to false. That means they're explicitly allowed to return one of their truthy arguments, and often will do so. Why would you want to consider that false?
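As a small illustration of truthy results that are not the True constant, here is a sketch using only the standard library. re.match returns a match object (or None), and the or operator returns one of its operands. The sample strings are made up for the example.

```python
import re

# re.match returns a Match object on success, not the True constant.
match = re.match(r"\d+", "42 apples")

found = bool(match)     # truthiness: the pattern matched
strict = match is True  # identity: a Match object is never True

# `or` evaluates to one of its operands rather than to a bool.
first_nonempty = "" or "fallback"

print(found, strict, first_nonempty)
```

An is True test would treat the successful match as a failure, even though the result is documented to be truthy on success.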


Finally, why would numpy, or any other C extension library, define such bool-compatible-but-not-bool types?

In general, it's because they're wrapping a C int or a C++ bool or some other such type. In numpy's case, it's wrapping a value that may be stored in a fastest-machine-word type or a single byte (maybe even a single bit in some cases) as appropriate for performance, and your code doesn't have to care which, because all representations look the same, including being truthy and equal to the True constant.

abarnert
  • 3
    Actually, traditionally I'd think that `if greeting is True` is better than `if greeting == True`; it's for when you want to accept True but not truthy values, a rare scenario but a reasonable one nonetheless. – Veedrac Sep 20 '13 at 17:23
  • @Veedrac: I don't know what you mean by "traditionally", but PEP 8 explicitly calls it "Worse", not "better". Also, `== True` won't accept all truthy values, just values that are equal to `True`, which _already_ excludes things like `[0]`, `"false"`, etc. So, your motivation for doing it is wrong in the first place. – abarnert Sep 20 '13 at 17:25
  • 2
    FYI, you are quoting a portion of PEP8 that is [controversial](https://github.com/jcrocholl/pep8/issues/134#issuecomment-14952244). – drammock Sep 20 '13 at 17:29
  • @drammock: That link doesn't show anyone disagreeing with this part of PEP8, just with the wording that the `pep8` style checker uses to diagnose it. – abarnert Sep 20 '13 at 17:53
  • 2
    Quoting from [that same link](https://github.com/jcrocholl/pep8/issues/134#issuecomment-14952244): "I agree that the general recommendation is if cond: and if not cond:, and this is sufficient for 95% of the use cases. However there are some (rare) cases where we want to check for identity with booleans..." (goes on to give examples). – drammock Sep 20 '13 at 17:55
  • @drammock: The example that it gives is using `True` as a magic sentinel value, which is not at all the same as using it as the expected value from a function that just claims to evaluate to true. – abarnert Sep 20 '13 at 18:18
  • @drammock: More importantly, he explicitly says it's rare, and not what you want 95% of the time, which doesn't at all support the idea that you should do it all the time (as you want to), or that the recommendation is wrong or controversial. Every recommendation in PEP8 has rare cases that are exceptions. The very first section after the introduction explains this at great length. – abarnert Sep 20 '13 at 18:19
  • I'll concede that point. I was conflating PEP8 with the specific message E712, which gives both `if cond is True:` and `if cond:` as equally preferable to `if cond == True:`. It's the E712 message that's controversial, not the PEP8 recommendation itself. – drammock Sep 20 '13 at 18:31
  • 1
    **The repeated appeals to PEP8 in this answer and the commentary above are overly hostile, arguably condescending, and frankly unnecessary.** NumPy developers themselves have openly acknowledged that NumPy's failure to leverage standard Python scalar types is a well-known (albeit unavoidable) design flaw for which no practical NumPy-side solutions exist: ["...if we had a choice, I think that (*i.e., use standard Python scalar types*) may be what we would do (or abolish the scalars completely effectively doing the same)."](https://github.com/numpy/numpy/issues/12950#issuecomment-462800326) – Cecil Curry Feb 27 '19 at 08:47
  • 1
    Tangentially, there are *many* valid justifications for explicit comparison against the `True` and `False` singletons. Implicit boolean comparisons invite edge cases with non-boolean truthiness (e.g., silent coercions of both integer 0 and the empty string to `False`) and hence violate the "Explicit is better than implicit." maxim of [PEP 20](https://www.python.org/dev/peps/pep-0020) – which arguably trumps PEP 8 in this regard. When testing external caller-defined data or input deriving from questionable sources, for example, assuming sane boolean truthiness typically increases attack surface. – Cecil Curry Feb 28 '19 at 08:56
6

why does numpy have its own boolean type

Space and speed. Numpy stores things in compact arrays; if it can fit a boolean into a single byte it'll try. You can't easily do this with Python objects, as you have to store references which slows calculations down significantly.
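The space argument can be checked with a rough sketch. It assumes NumPy is installed and compares against a plain list on CPython, where a list stores one pointer-sized reference per element. The size of 1000 elements is arbitrary.

```python
import sys

import numpy as np

# A boolean array stores each element inline, one byte apiece.
flags = np.zeros(1000, dtype=np.bool_)
bytes_per_flag = flags.itemsize
total_bytes = flags.nbytes

# A list stores a reference to a Python object for every element,
# so its own footprint is several bytes per entry (CPython detail).
list_flags = [False] * 1000
list_bytes = sys.getsizeof(list_flags)

print(bytes_per_flag, total_bytes, list_bytes)
```

The exact list size depends on the interpreter and platform, but the array's data buffer is one byte per element.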

I can get away with writing if foo: to get expected behavior in the example above, but I like the more stringent if foo is True: because it excludes things like 2 and [2] from returning True, and sometimes the explicit type check is desirable.

Well, don't do that.

Veedrac