114

I'm trying to evaluate ((x == a and y == b) or (x == b and y == a)) in Python, but it seems a bit verbose. Is there a more elegant way?

LetEpsilonBeLessThanZero
  • 2,395
  • 2
  • 12
  • 22
  • 8
    Depends on what type of objects `x,y, a,b` are: are they ints/floats/strings, arbitrary objects, or what? If they were builtin types and it was possible to keep both `x,y` and `a,b` in sorted order, then you could avoid the second branch. Note that creating a set will cause each of the four elements `x,y, a,b` to be hashed, which might or might not be trivial or have a performance implication depending entirely on what type of objects they are. – smci Oct 17 '19 at 23:55
  • 18
    Keep in mind the people you work with and the kind of project you're developing. I use a little Python here and there. If someone coded one of the answers here I would totally have to google what's going on. Your conditional is readable pretty much no matter what. –  Oct 18 '19 at 13:54
  • 4
    I wouldn't bother with any of the replacements. They're more terse, but not as clear (IMHO) and I think will all be slower. – Barmar Oct 18 '19 at 14:58
  • 24
    This is a classic XYAB problem. – Tashus Oct 18 '19 at 15:53
  • 2
    @CarlWitthoft I'm in. `if (x+y == a+b) and x in {a,b}: ...` – Andy Oct 18 '19 at 15:55
  • 6
    And in general if you're dealing with order-independent collections of objects, use sets/dicts/etc. This really is an XY problem, we'd need to see the larger codebase it's drawn from. I agree that `((x == a and y == b) or (x == b and y == a))` may look yukky, but 1) its intent is crystal clear and intelligible to all non-Python programmers, not cryptic 2) interpreters/compilers will always handle it well and it can essentially never result in non-performant code, unlike the alternatives. So, 'more elegant' can have serious downsides too. – smci Oct 18 '19 at 17:28
  • @Andy Just imagine if the operator precedence put "==" ahead of "+" in your submission! – Carl Witthoft Oct 18 '19 at 18:06
  • Try using DeMorgans rule: not((not(x == a) or not(y == b)) and (not(x == b) or not (y == a)) – Joel Keene Oct 19 '19 at 20:24
  • 1
    @JoelKeene You should always consider [DeMorgans Law](https://en.wikipedia.org/wiki/De_Morgan%27s_laws) in case it improves readability. But I don't think that's happening here. – candied_orange Oct 20 '19 at 16:34

10 Answers10

156

If the elements are hashable, you could use sets:

{a, b} == {y, x}
Dani Mesejo
  • 61,499
  • 6
  • 49
  • 76
  • 3
    But that will also cover both equal to a or both equal to b. Perhaps the OP wants this, perhaps not. – Graham Oct 18 '19 at 00:29
  • @Graham if both are equal to `a`, and `a != b` then the above statement is False, is True only if `a == b`, this implies `(x == a and y == b)`. – Dani Mesejo Oct 18 '19 at 00:44
  • 21
    @Graham no it doesn't, because there are exactly two items in the right hand set. If both are a, there's no b, and if both are b, there's no a, and in either case the sets can't be equal. – hobbs Oct 18 '19 at 00:46
  • 4
    Ah, ok. My mistake, sorry. – Graham Oct 18 '19 at 00:59
  • 13
    On the other hand, if we had *three* elements on each side and we needed to test whether they could be matched up one to one, then sets wouldn't work. `{1, 1, 2} == {1, 2, 2}`. At that point, you need `sorted` or `Counter`. – user2357112 Oct 18 '19 at 07:09
  • It's not just about whether there are two or more than two elements on each side, but about whether all the elements are *distinct*. If we compare `{a, b, c}` to `{1, 2, 3}` it still works; – Karl Knechtel Oct 18 '19 at 10:40
  • 11
    I find this tricky to read (don't read the "{}" as "()"). Enough to comment it - and then the purpose is lost. – Édouard Oct 18 '19 at 11:20
  • 9
    I know it's python but creating two new sets just to compare the values seems like an overkill to me... Just define a function which does it and call when needed. – marxin Oct 18 '19 at 13:18
  • 6
    @marxin A function would be even bigger overkill than 2 simple set constructions. And less readable with 4 arguments. – Gloweye Oct 18 '19 at 14:37
  • 1
    This is such a good solution, but it might be clearer for some people as `set(x,y) == set(a,b)` – Andy Oct 18 '19 at 15:56
  • 4
    @Andy: Except that the `set` constructor doesn't actually work like that, so you get a `TypeError`. `set([x, y]) == set([a, b])` works, but compared to `{x, y} == {a, b}`, it runs slower, takes more typing, and involves more syntactic nesting (hurting clarity, particularly when the element expressions are more complex than `x` and `y`). – user2357112 Oct 19 '19 at 02:49
  • 2
    @marxin: Your intuition is probably based on statically or JIT-compiled language implementations, where functions can be easily inlined. [In CPython, introducing a function almost doubles the cost of the test relative to doing it inline, and is sometimes more expensive than building sets.](https://ideone.com/3alqci) – user2357112 Oct 19 '19 at 04:21
  • @Gloweye I've tested it and function is faster, not much but still. But regardless of that I would go for function, as its more explicit. Comparing sets is a nice trick but fundamentally feels wrong to do that, not only for the reason I mentioned before. Using sets compare the hashes, not the objects, so its not even doing the same thing. – marxin Oct 21 '19 at 08:03
  • @marxin It's about readability for me. You'd need a function with a short readable name that takes 4 arguments, and makes clear what it does without having to look it up to get even close to the readability of a simple set comparison like this. Set comparison also fits really well conceptually. The OP wants to compare 2 groups of 2 numbers to check if they are the same. That's exactly what python sets do in comparisons. I'm fine with agreeing to disagree, though. I can see where you're coming from. – Gloweye Oct 21 '19 at 08:09
  • 1
    @marxin: Using sets does not just compare hashes. Hashes are only used as an optimization; set operations check for actual object equality after finding a hash match. – user2357112 Nov 08 '19 at 12:04
62

I think the best you could get is to package them into tuples:

if (a, b) == (x, y) or (a, b) == (y, x)

Or, maybe wrap that in a set lookup

if (a, b) in {(x, y), (y, x)}

Just since it was mentioned by a couple comments, I did some timings, and tuples and sets appear to perform identically here when the lookup fails:

from timeit import timeit

x = 1
y = 2
a = 3
b = 4

>>> timeit(lambda: (a, b) in {(x, y), (y, x)}, number=int(5e7))
32.8357742

>>> timeit(lambda: (a, b) in ((x, y), (y, x)), number=int(5e7))
31.6169182

Although tuples are actually faster when the lookup succeeds:

x = 1
y = 2
a = 1
b = 2

>>> timeit(lambda: (a, b) in {(x, y), (y, x)}, number=int(5e7))
35.6219458

>>> timeit(lambda: (a, b) in ((x, y), (y, x)), number=int(5e7))
27.753138700000008

I chose to use a set because I'm doing a membership lookup, and conceptually a set is a better fit for that use-case than a tuple. If you measured a significant different between the two structures in a particular use case, go with the faster one. I don't think performance is a factor here though.

Carcigenicate
  • 43,494
  • 9
  • 68
  • 117
  • 12
    The tuple method looks very clean. I'd be concerned about the performance impact of using a set. You could do `if (a, b) in ((x, y), (y, x))`, though? – Brilliand Oct 18 '19 at 05:19
  • 20
    @Brilliand If you're concerned about performance impact, Python is not the language for you. :-D – Sneftel Oct 18 '19 at 10:35
  • 1
    I really like the first method, two tuple comparisons. It's easy to parse (one clause at a time) and each clause is very simple. And top of which, it should be relatively efficient. – Matthieu M. Oct 18 '19 at 11:18
  • Is there any reason to prefer the `set` solution in the answer to the tuple solution from @Brilliand? – user1717828 Oct 18 '19 at 15:28
  • A set requires that objects be hashable, so a tuple would be a more general solution with fewer dependencies. The performance, while probably not generally important, may also look very different when dealing with large objects with expensive hashing and equality checks. – Bernhard Barker Oct 19 '19 at 15:23
  • @Dukeling True, which is why I noted at the bottom to address it in a case-by-case. I like sets, but yes, there may be cases where they aren't appropriate to use. – Carcigenicate Oct 19 '19 at 15:24
  • A single timeit call isn't a great source of 'proof'. Even more so when your proof that tuples are faster is on the best case for tuples and you happily ignore the worst case. – Peilonrayz Oct 20 '19 at 23:14
  • @Peilonrayz It was repeatable, and was 50 million tests over 30 seconds. I was also arguing against using tuples as you can see above, so it's not like I'm inflating my own position by casting them in a good light. I can add another test for tuples if you specify the worst case scenario that you're referring to. I also never used the word "proof", because obviously a time test is never absolute proof of anything. – Carcigenicate Oct 20 '19 at 23:20
  • @Carcigenicate Yes, 50million can suffer from CPU issues, say Spotify is loading a new song or some other background process. I have seen large fluctuations from single timeit calls. One isn't a reliable metric. I never said you said the word proof, ' != ". – Peilonrayz Oct 20 '19 at 23:27
  • @Peilonrayz I'm not sure what point you're trying to make. Someone said there may be a performance issue and I did a test which showed that there seems to be a minimal difference (and again, I independently tested it multiple times, which is about as good as I can do). I then explicitly said at the bottom to do your own testing before coming to a conclusion if you're worried. – Carcigenicate Oct 20 '19 at 23:32
  • @Carcigenicate I don't see how I can be anymore clear - I've said the same thing multiple times in multiple ways. Have a good day. – Peilonrayz Oct 20 '19 at 23:34
32

Tuples make it slightly more readable:

(x, y) == (a, b) or (x, y) == (b, a)

This gives a clue: we're checking whether the sequence x, y is equal to the sequence a, b but ignoring ordering. That's just set equality!

{x, y} == {a, b}
Thomas
  • 174,939
  • 50
  • 355
  • 478
  • `,` creates a tuple, not a list. so `(x, y)` and `(a, b)` are tuples, same as `x, y` and `a, b`. – kissgyorgy Jan 08 '20 at 13:24
  • I meant "list" in the sense of "ordered sequence of elements", not in the sense of the Python `list` type. Edited because indeed this was confusing. – Thomas Jan 08 '20 at 15:52
28

If the items aren't hashable, but support ordering comparisons, you could try:

sorted((x, y)) == sorted((a, b))
jasonharper
  • 9,450
  • 2
  • 18
  • 42
28

The most elegant way, in my opinion, would be

(x, y) in ((a, b), (b, a))

This is a better way than using sets, i.e. {a, b} == {y, x}, as indicated in other answers because we don't need to think if the variables are hashable.

Wagner Macedo
  • 606
  • 5
  • 6
  • How does this differ from [this earlier answer](https://stackoverflow.com/a/58435718/2602718)? – scohe001 Oct 18 '19 at 19:38
  • 5
    @scohe001 It uses a tuple where the earlier answer uses a set. That earlier answer did consider this solution, but declined to list it as a recommendation. – Brilliand Oct 19 '19 at 00:03
25

If these are numbers, you can use (x+y)==(a+b) and (x*y)==(a*b).

If these are comparable items, you can use min(x,y)==min(a,b) and max(x,y)==max(a,b).

But ((x == a and y == b) or (x == b and y == a)) is clear, safe, and more general.

lhf
  • 70,581
  • 9
  • 108
  • 149
  • 2
    Haha, symmetric polynomials ftw! – Carsten S Oct 18 '19 at 16:06
  • 4
    This is the correct answer, in particular the last sentence. Keep it simple, this should not require multiple new iterables or anything like that. For those that want to use sets, take a look at [the implementation of set objects](https://github.com/python/cpython/blob/master/Objects/setobject.c) and then imagine trying to run this in a tight loop... – Z4-tier Oct 18 '19 at 17:52
  • 4
    @RazvanSocol OP didn't say what the data types are, and this answer does qualify the solutions that are type-dependent. – Z4-tier Oct 18 '19 at 17:54
23

As a generalization to more than two variables we can use itertools.permutations. That is instead of

(x == a and y == b and z == c) or (x == a and y == c and z == b) or ...

we can write

(x, y, z) in itertools.permutations([a, b, c])

And of course the two variable version:

(x, y) in itertools.permutations([a, b])
a_guest
  • 34,165
  • 12
  • 64
  • 118
  • 6
    Great answer. It is worth pointing out here (for those that haven't done much with generators before) that this is very memory efficient, as only one permutation is created at a time, and the "in" check would stop and return True immediately after a match is found. – robertlayton Oct 20 '19 at 00:01
  • 2
    It's also worth pointing out that the complexity of this method is `O(N*N!)`; For 11 variables, this can take over a second to finish. (I posted a faster method, but it still takes `O(N^2)`, and starts taking over a second on 10k variables; So it seems this can either be done fast or generally (wrt. hashability/orderability), but not both :P) – Aleksi Torhamo Oct 21 '19 at 06:57
15

You can use tuples to represent your data and then check for set inclusion, like:

def test_fun(x, y):
    test_set = {(a, b), (b, a)}

    return (x, y) in test_set
Nils Müller
  • 151
  • 4
  • 3
    Written as a one-liner and with a list (for non-hashable items), I would consider this to be the best answer. (although I'm a complete Python newbie). – Édouard Oct 18 '19 at 11:02
  • 1
    @Édouard Now there's another answer that's essentially that (just with a tuple instead of a list, which is more efficient anyway). – Brilliand Oct 19 '19 at 01:17
11

You already got the most readable solution. There are other ways to express this, perhaps with less characters, but they are less straight-forward to read.

Depending on what the values actually represent your best bet is to wrap the check in a function with a speaking name. Alternatively or in addition, you can model the objects x,y and a,b each in dedicated higher class objects that you then can compare with the logic of the comparison in a class equality check method or a dedicated custom function.

Frank Hopkins
  • 639
  • 4
  • 12
5

It seems the OP was only concerned with the case of two variables, but since StackOverflow is also for those who search for the same question later, I'll try to tackle the generic case here in some detail; One previous answer already contains a generic answer using itertools.permutations(), but that method leads to O(N*N!) comparisons, since there are N! permutations with N items each. (This was the main motivation for this answer)

First, let's summarize how some of the methods in previous answers apply to the generic case, as motivation for the method presented here. I'll be using A to refer to (x, y) and B to refer to (a, b), which can be tuples of arbitrary (but equal) length.

set(A) == set(B) is fast, but only works if the values are hashable and you can guarantee that one of the tuples doesn't contain any duplicate values. (Eg. {1, 1, 2} == {1, 2, 2}, as pointed out by @user2357112 under @Daniel Mesejo's answer)

The previous method can be extended to work with duplicate values by using dictionaries with counts, instead of sets: (This still has the limitation that all values need to be hashable, so e.g. mutable values like list won't work)

def counts(items):
    d = {}
    for item in items:
        d[item] = d.get(item, 0) + 1
    return d

counts(A) == counts(B)

sorted(A) == sorted(B) doesn't require hashable values, but is slightly slower, and requires orderable values instead. (So e.g. complex won't work)

A in itertools.permutations(B) doesn't require hashable or orderable values, but like already mentioned, it has O(N*N!) complexity, so even with just 11 items, it can take over a second to finish.

So, is there a way to be as general, but do it considerably faster? Why yes, by "manually" checking that there's the same amount of each item: (The complexity of this one is O(N^2), so this isn't good for large inputs either; On my machine, 10k items can take over a second - but with smaller inputs, like 10 items, this is just as fast as the others)

def unordered_eq(A, B):
    for a in A:
        if A.count(a) != B.count(a):
            return False
    return True

To get the best performance, one might want to try the dict-based method first, fall back to the sorted-based method if that fails due to unhashable values, and finally fall back to the count-based method if that too fails due to unorderable values.

wjandrea
  • 28,235
  • 9
  • 60
  • 81
Aleksi Torhamo
  • 6,452
  • 2
  • 34
  • 44
  • `counts` is equivalent to [`collections.Counter`](https://docs.python.org/3/library/collections.html#collections.Counter), which is implemented in C in Python 3 so should be faster, hopefully. – wjandrea Jun 16 '20 at 22:16