Is there a preferred method for doing a logical XOR in Python?

For example, if I have two variables a and b, and I want to check that at least one exists but not both, I have two methods:

Method 1 (bitwise operator):

if bool(a) ^ bool(b):
    do_x()  # placeholder for the real action

Method 2 (boolean operators):

if (not a and b) or (a and not b):
    do_x()  # placeholder for the real action

Is there an inherent performance benefit to using either one? Method 2 seems more "pythonic" but Method 1 looks much cleaner to me. This related thread seems to indicate that it might depend on what variable types a and b are in the first place!

Any strong arguments either way?
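
To make the difference between the two methods concrete, here is a minimal sketch, assuming Python 3. The names `method1`, `method2` and the `samples` list are illustrative and not part of the original question. Inside an `if` both methods agree on truthiness, but Method 2 can evaluate to one of its operands instead of a bool:

```python
from itertools import product

def method1(a, b):
    # Bitwise XOR applied to two bools: the result is always a bool.
    return bool(a) ^ bool(b)

def method2(a, b):
    # `and` / `or` return one of their operands, so the result need not be a bool.
    return (not a and b) or (a and not b)

samples = ["", "x", 0, 2, None, [1]]  # illustrative values only

for a, b in product(samples, repeat=2):
    # Once coerced to bool, the two methods agree for every pair.
    assert method1(a, b) == bool(method2(a, b))

print(repr(method1("", "y")))  # True
print(repr(method2("", "y")))  # 'y'
```

So the choice only matters for the value of the expression if that value is stored or returned rather than tested directly.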

dizzyf
  • In what way does the other thread not answer your question? Those two are not equivalent, as detailed in the linked thread. – TemporalWolf Oct 17 '16 at 22:25
  • "basic"? You mean "boolean", right? – OneCricketeer Oct 17 '16 at 22:28
  • @TemporalWolf: Python doesn't have an "xor" boolean operator and I need to simulate that behavior in a script I'm writing. I'm asking specifically about 'pythonic' style/performance for two distinct xor implementations. I'm aware that they are not equivalent. – dizzyf Oct 17 '16 at 22:29
  • @cricket_007: yep, edited. – dizzyf Oct 17 '16 at 22:30
  • "I want to check that at least one exists but not both" - then neither of these options do that, and if you're trying to check whether variables exist, you've probably picked a bad way to structure this part of your program. – user2357112 Oct 17 '16 at 22:31
  • @dizzyf I would say `def xor(a, b): return (a and not b) or (b and not a)` is the most pythonic way to do it, then call `xor(a, b)` on things, although that assumes `a` and `b` are boolean values. wrap them if needed – TemporalWolf Oct 17 '16 at 22:43
  • @TemporalWolf: I like that! – dizzyf Oct 17 '16 at 22:45
  • @dizzyf it's worth mentioning, because `and` has a [stronger binding](https://docs.python.org/2/reference/expressions.html#operator-precedence), `def xor(a, b): return a and not b or b and not a` works just as well, although is less readable. – TemporalWolf Oct 17 '16 at 22:47
  • I do not think it is a good idea to mark this as a duplicate, as the user himself mentioned the approaches. He is more interested in which one should be preferred and *why*. – Moinuddin Quadri Oct 17 '16 at 22:50
  • Since `^` is *bitwise* XOR, not logical XOR, I think `bool(a) != bool(b)` is more appropriate and readable? They both work, but applying bitwise operator to logical operation seems wrong. – endolith May 10 '22 at 15:53

2 Answers

An alternative way to achieve this is to use any() and all():

if any([a, b]) and not all([a, b]):
    print("Exactly one of a and b has a value")

In terms of performance, however, here are the timings:

  1. Using any() and all(): 0.542 usec per loop

    moin@moin-pc:~$ python -m "timeit" "a='a';b='b';" "any([a, b]) and not all([a, b])"
    1000000 loops, best of 3: 0.542 usec per loop
    
  2. Using bool(a) ^ bool(b): 0.594 usec per loop

    moin@moin-pc:~$ python -m "timeit" "a='a';b='b';" "bool(a) ^ bool(b)"
    1000000 loops, best of 3: 0.594 usec per loop
    
  3. Using (not a and b) or (a and not b): 0.0988 usec per loop

    moin@moin-pc:~$ python -m "timeit" "a='a';b='b';" "(not a and b) or (a and not b)"
    10000000 loops, best of 3: 0.0988 usec per loop
    

Clearly, your (not a and b) or (a and not b) is the most efficient of the three for these inputs: roughly 5.5 to 6 times faster than the other two.
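
The command-line runs above time a single input pair (both values truthy). As a rough sketch, assuming Python 3.6+ and the standard `timeit` module, the same three expressions can be timed over all four truthiness combinations, along the lines TemporalWolf suggests in the comments. The helper name `time_expression` and the loop count are my own choices, and the numbers will differ from machine to machine:

```python
import timeit

# The three expressions timed in this answer.
EXPRESSIONS = [
    "any([a, b]) and not all([a, b])",
    "bool(a) ^ bool(b)",
    "(not a and b) or (a and not b)",
]

def time_expression(expr, number=100_000):
    # Evaluate the expression for every falsy/truthy combination of a and b,
    # so short-circuiting cannot favour one particular input pair.
    stmt = "[%s for a in ('', 'a') for b in ('', 'b')]" % expr
    return timeit.timeit(stmt, number=number)

results = {expr: time_expression(expr) for expr in EXPRESSIONS}

# Print fastest first; absolute values are only meaningful relative to each other.
for expr, seconds in sorted(results.items(), key=lambda item: item[1]):
    print("%-36s %.4f s" % (expr, seconds))
```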


Comparison of a few more variants using and and or:

  1. Using a and not b or b and not a (as pointed out by TemporalWolf): 0.116 usec per loop

    moin@moin-pc:~$ python -m "timeit" "a='a';b='b';" "a and not b or b and not a"
    10000000 loops, best of 3: 0.116 usec per loop
    
  2. Using (a or b) and not (a and b): 0.0951 usec per loop

    moin@moin-pc:~$ python -m "timeit" "a='a';b='b';" "(a or b) and not (a and b)"
    10000000 loops, best of 3: 0.0951 usec per loop
    

Note: These timings were measured with a and b as str values. The results depend on how __nonzero__ / __bool__ / __or__ are implemented for the operand types, as viraptor mentions in the comments.
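
To illustrate that dependence, here is a small sketch, assuming Python 3. `Counted` is a hypothetical class (not from any library) that counts how often its truth value is requested; with a costly `__bool__`, the number of truth tests matters more than the operator used:

```python
class Counted:
    """Hypothetical wrapper that counts how often its truth value is requested."""

    def __init__(self, value):
        self.value = value
        self.calls = 0

    def __bool__(self):  # named __nonzero__ in Python 2
        self.calls += 1
        return bool(self.value)

def count_truth_tests(expr):
    # Both operands are truthy, matching the a='a', b='b' timing setup.
    a, b = Counted("a"), Counted("b")
    expr(a, b)
    return a.calls + b.calls

print(count_truth_tests(lambda a, b: bool(a) ^ bool(b)))               # 2
print(count_truth_tests(lambda a, b: (not a and b) or (a and not b)))  # 3
```

With two truthy operands the and/or form tests `a` twice and `b` once, so it is not automatically cheaper when the truth test itself is expensive.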

Moinuddin Quadri
  • `a and not b or b and not a` is equivalent. `and` has a [stronger binding](https://docs.python.org/2/reference/expressions.html#operator-precedence) than `or`. – TemporalWolf Oct 17 '16 at 22:50
  • It depends on the implementation of `__nonzero__` / `__bool__` / `__or__`. For a str it's trivial. For something that does remote call it's not. Op said nothing about the values used. – viraptor Oct 17 '16 at 22:52
  • @anonymous I think you may also see the benefit of `and`/`or` short-circuiting: Running it as a list comprehension only halves the time: `for str in ("(bool(a) ^ bool(b))", "any([a, b]) and not all([a, b])", "a and not b or b and not a"): print timeit.timeit("[%s for a in range(2) for b in range(2)]" % str)` – TemporalWolf Oct 17 '16 at 22:59
  • @viraptor: Agree with you on that. It totally depends on the implementation of these functions. Added your comment with the answer – Moinuddin Quadri Oct 17 '16 at 23:08
  • @TemporalWolf: Added the `timeit` stats for the expression you mentioned with the answer – Moinuddin Quadri Oct 17 '16 at 23:09
  • @anonymous I was more getting at: Your timeit is doing (equivalently) `a = True, b = True`, which means when you run an `and/or` example, the second part never runs, as it short-circuits: The list comprehension I posted forces it to try all four logical combinations, giving a better approximation of the benefit of the short-circuit (instead of letting it do it every time). If you do `not a and b or not b and a` it will be even faster, as that exploits the short-circuiting of `True/True` inputs. – TemporalWolf Oct 17 '16 at 23:16

You can make the intent more readable than by reducing the problem to XOR. Depending on the context these may be better:

if sum((bool(a), bool(b))) == 1:  # this naturally extends to more values
if bool(a) != bool(b):

So I think the best way is to go with what matches the actual meaning behind the XOR. Do you want them to not have the same value? Only one of them set? Something else?
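
As a minimal sketch of those two readings, assuming Python 3 (the function names `exactly_one` and `xor` are illustrative, not an established API):

```python
def exactly_one(*values):
    # "Only one of them set": count the truthy values; True sums as 1.
    return sum(bool(v) for v in values) == 1

def xor(a, b):
    # "Not the same truth value": the two-argument case.
    return bool(a) != bool(b)

print(exactly_one("a", ""))     # True
print(exactly_one("a", "b"))    # False
print(exactly_one("", 0, [1]))  # True
print(exactly_one(1, 1, 1))     # False
print(xor("a", ""))             # True
```

Note that the two meanings diverge for more than two values: chaining `^` over three truthy values gives True (odd parity), whereas `exactly_one(1, 1, 1)` is False.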

If you use ^ and I'm reading the code, I'm going to assume you actually wanted to use a bitwise operator and that it matters for some reason.
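
A short sketch of why that assumption is reasonable, assuming Python 3 and illustrative values: without the bool() wrappers, ^ really does operate on bits, and it is not defined for every type.

```python
a, b = 1, 2  # both truthy; values chosen only for illustration

bitwise = a ^ b                # 3: XOR of 0b01 and 0b10, which is truthy
logical = bool(a) ^ bool(b)    # False: XOR of the two truth values
compared = bool(a) != bool(b)  # False: same answer without a bitwise operator

print(bitwise, logical, compared)

try:
    "a" ^ "b"  # str does not support ^ at all
except TypeError as exc:
    print("TypeError:", exc)
```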

"Is there an inherent performance benefit to using either one?"

It's one statement. Unless you know it's a performance issue, it doesn't matter. If it is in a hot loop and your profiler shows you do need to optimise it, then you're likely better off using Cython or some other method of speeding it up.

viraptor
  • For speed, technically `if (not a) is not (not b):` would be slightly faster and equivalent to `if bool(a) != bool(b):`, at least on CPython, because `not` is syntax (adds a `UNARY_NOT` instruction), while `bool` adds a `LOAD_GLOBAL` and `CALL_FUNCTION` (both more expensive). Switching to `is not` means you follow the code path that only deals with identity equality, no rich comparison machinery. – ShadowRanger Oct 17 '16 at 22:42
  • Sure, but again: "Unless you know it's a performance issue, it doesn't matter." I know straight away what `bool(a)!=bool(b)` does. I have to spend time figuring out what `(not a) is not (not b)` does. – viraptor Oct 17 '16 at 22:58
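
The equivalence claimed in the comment above can be checked with a short sketch, assuming Python 3 on CPython. The function names and the sample values are my own. Opcode names vary between CPython versions, so they are printed for inspection rather than relied upon:

```python
import dis
from itertools import product

def with_bool(a, b):
    # Calls the global `bool` twice, then uses a rich comparison.
    return bool(a) != bool(b)

def with_not(a, b):
    # `not` always yields the True/False singletons, so identity comparison is safe here.
    return (not a) is not (not b)

samples = ["", "x", 0, 1, None, [], [0]]  # illustrative values only
for a, b in product(samples, repeat=2):
    assert with_bool(a, b) == with_not(a, b)

# Show the bytecode each version compiles to on the running interpreter.
for func in (with_bool, with_not):
    print(func.__name__, [ins.opname for ins in dis.get_instructions(func)])
```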