
`x^=y^=x^=y;` is a tricky/amusing implementation of the XOR swap algorithm in C and C++. It parses as `x^=(y^=(x^=y));` and relies on the fact that assignment operators yield the assigned value. But is it correct? The GCC 10.3.0 C compiler gives me the warning `operation on ‘x’ may be undefined [-Wsequence-point]`, and clang 12.0.0 gives `warning: unsequenced modification and access to 'x' [-Wunsequenced]`. Compiled as C++, clang continues to warn the same way, while GCC stops warning. So is this code correct in either language? It looks rather sequenced to me, but maybe it's illegal to modify a variable twice in the same statement?

As pointed out in this answer, `clang++ -std=c++17` does not give the warning. With `-std=c++11` the situation is as described above. So maybe my question should be further broken down into C/C++11/C++17.

acupoftea
  • Nothing will annoy your co-workers more than coming across that code during a bug hunt :) – Jeremy Friesner Mar 16 '22 at 05:24
  • It's a hackish way to swap; just swap in a readable, easily maintainable way and let the compiler optimize. – David C. Rankin Mar 16 '22 at 05:26
  • The question is asked in a language-lawyer spirit; especially in C++ there is no need for this, as there is the `std::swap` function. – acupoftea Mar 16 '22 at 05:40
  • It works and it's legal because it doesn't do anything strange other than to rely on bitwise identities to swap two numbers. See [How does XOR variable swapping work?](https://stackoverflow.com/q/249423/3422102) That said, it is one of the most unreadable ways to approach swapping. [XOR swap algorithm - Wikipedia](https://en.wikipedia.org/wiki/XOR_swap_algorithm) has more details on the background and identities. – David C. Rankin Mar 16 '22 at 05:46
  • *a tricky/amusing implementation* is one way of describing unmaintainable code that has no place in any real-world source code. – Andrew Mar 16 '22 at 06:57
  • Does this answer your question? [How XOR Assignment operator ^= is utilized to reverse an array in c](https://stackoverflow.com/questions/69769815/how-xor-assignment-operator-is-utilized-to-reverse-an-array-in-c) – Ptit Xav Mar 16 '22 at 07:15
  • This is rather a duplicate of [What made i = i++ + 1; legal in C++17?](https://stackoverflow.com/questions/47702220/what-made-i-i-1-legal-in-c17). Apart from sequencing, I think we have proven [over and over](https://stackoverflow.com/a/70287115/584518) on SO that XOR swapping is never more efficient than temporary variable swapping, but often less efficient. And since it is definitely less readable, the only purposes for using it are code golf or posing as a bad programmer... – Lundin Mar 16 '22 at 07:35
  • Before C++17, it is definitely undefined behaviour, precisely because it modifies variables more than once in a statement. C++17 has introduced different sequencing rules so the behaviour is no longer undefined. (Which, IMHO, represents one of a number of poor choices in C++17, because it can arguably be used to justify poor techniques like this one). – Peter Mar 16 '22 at 07:36
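For reference, the readable alternatives mentioned in the comments could look roughly like this. It is only a sketch: the helper names `swap_with_std` and `swap_with_temp` are made up for illustration, and `unsigned` is an arbitrary choice of type.

```cpp
#include <utility>  // std::swap

// Hypothetical helper names, for illustration only.

// Idiomatic C++: states the intent directly.
inline void swap_with_std(unsigned& x, unsigned& y) {
    std::swap(x, y);
}

// Plain temporary-variable swap. Each statement ends a full expression,
// so there is no sequencing question in any C++ version.
inline void swap_with_temp(unsigned& x, unsigned& y) {
    const unsigned tmp = x;  // remember the old x
    x = y;                   // x takes the old y
    y = tmp;                 // y takes the old x
}
```

Both versions also behave sensibly when `x` and `y` refer to the same object, which an XOR swap does not (it would zero the value).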

1 Answer


Add `--std=c++17` to your compiler flags and you will not get the warning anymore.

C++17 added a sequencing rule that removes the undefined behavior here, and this expression depends on that rule:

> In every simple assignment expression `E1=E2` and every compound assignment expression `E1@=E2`, every value computation and side effect of `E2` is sequenced before every value computation and side effect of `E1`.

That said, I suggest you never use it in real code.
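To make the effect of the quoted rule concrete, here is a rough sketch that spells out two evaluation orders of `x ^= (y ^= (x ^= y));` using only well-defined statements. The helper names `right_operand_first` and `left_read_first` are hypothetical, and the second function models just one plausible outcome of the pre-C++17 undefined behavior, not what any particular compiler is guaranteed to do.

```cpp
#include <cassert>

// Hypothetical helpers that spell out two evaluation orders of
//     x ^= (y ^= (x ^= y));
// using only well-defined statements.

// Order required since C++17: the right operand E2 is evaluated
// completely before the left operand E1 (the outer x) is read.
inline void right_operand_first(unsigned& x, unsigned& y) {
    assert(&x != &y);        // sketch assumes two distinct objects
    x ^= y;                  // innermost: x == x0 ^ y0
    y ^= x;                  // middle:    y == y0 ^ (x0 ^ y0) == x0
    const unsigned rhs = y;  // value of E2
    const unsigned lhs = x;  // outer x is read only now: x0 ^ y0
    x = lhs ^ rhs;           // (x0 ^ y0) ^ x0 == y0
}

// One order a pre-C++17 compiler could effectively pick: the outer x is
// read before the inner assignment writes to it.
inline void left_read_first(unsigned& x, unsigned& y) {
    assert(&x != &y);        // sketch assumes two distinct objects
    const unsigned lhs = x;  // stale value x0
    x ^= y;                  // x == x0 ^ y0
    y ^= x;                  // y == x0
    const unsigned rhs = y;  // value of E2: x0
    x = lhs ^ rhs;           // x0 ^ x0 == 0
}
```

With `x = 5, y = 9`, the first order ends with the values swapped, while the second leaves `x` at 0 and `y` at 5.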

Afshin
  • I would like a more detailed explanation. As I see it, there are no computations or side effects on the left sides of assignment expressions here - it's just `x` or `y`. So why does this change matter? (a complete answer should also address the C language) – acupoftea Mar 16 '22 at 05:47
  • @acupoftea Which initial value of `x` for the first `x` in `x^=(y^=(x^=y))` should a compiler use? The original one or the one obtained after doing `x^=y`? – Evg Mar 16 '22 at 06:02
  • @acupoftea as you said, your expression is equal to `x^=(y^=(x^=y));` **mathematically**. By "mathematically" I mean that evaluation does not jump straight to `x^=y` and substitute its result; it starts from the beginning of the expression. So for the whole expression, E1 is `x` and E2 is `y^=(x^=y)`, and the quoted rule forces E2 to be evaluated before E1. – Afshin Mar 16 '22 at 06:02
  • @Afshin hmm, intuitively I'd say it doesn't matter what value the first `x` has if it's getting overwritten anyway. You're saying that despite this, UB happens? And I guess the same goes for the C language? – acupoftea Mar 16 '22 at 06:11
  • @acupoftea this type of UB is somewhat similar to `a[i] = i++;`, which is undefined before C++17. But anyway, it seems it should be OK for C from C11 onward. You can check the rules here: https://en.cppreference.com/w/cpp/language/eval_order, though it is one of the more irritating parts of cppreference. – Afshin Mar 16 '22 at 06:23
  • The way the issue is formally described in terms of the actual standard is that applying two side effects to a variable, or applying a side effect to a variable and reading it, with no sequenced-before relation between the two, is UB. Before C++17 there is no sequencing here except at the `;`, so it's instant UB. Also, @acupoftea: the first `x` in `x^=y^=x^=y` *is* being read because `^=` reads the LHS! `x^=y^=x^=y` means `x=x^(y^=x^=y)`. Among other things, the read of `x` from the outer `^=` conflicts with the inner write without the C++17 sequencing. (NB: two writes with no reads would also conflict.) – HTNW Mar 16 '22 at 06:50
  • [Undefined behavior and sequence points](https://stackoverflow.com/q/4176328/995714) – phuclv Mar 16 '22 at 08:47
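Pulling the thread together, one way to act on the C/C++11/C++17 split is to guard the one-liner by language version. This is only a sketch: the name `xor_swap` is hypothetical, and it assumes the compiler reports an accurate `__cplusplus` value (MSVC, for example, reports an old value unless `/Zc:__cplusplus` is set, in which case the fallback branch is used).

```cpp
#include <cassert>

// Hypothetical helper: uses the chained form only where the C++17
// sequencing rule applies, and three separate statements otherwise.
inline void xor_swap(unsigned& x, unsigned& y) {
    assert(&x != &y);  // XOR-swapping an object with itself would zero it
#if __cplusplus >= 201703L
    // Well-defined since C++17: each right operand is sequenced before
    // the corresponding left operand is read.
    x ^= y ^= x ^= y;
#else
    // Well-defined in every C++ version: each ';' ends a full expression.
    x ^= y;  // x == x0 ^ y0
    y ^= x;  // y == x0
    x ^= y;  // x == y0
#endif
}
```

The three-statement branch is also the form to use in C, where the chained expression remains flagged by both compilers as described in the question.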