8

I have always wondered why such a simple and basic operation as swapping the contents of two variables is not built into many languages.

It is one of the most basic programming exercises in computer science classes; it is heavily used in many algorithms (e.g. sorting); every now and then one needs it and one must use a temporary variable or use a template/generic function.

It is even a basic machine instruction on many processors, so that the standard scheme with a temporary variable will get optimized.
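For reference, the standard scheme with a temporary, wrapped once in a template function, could be sketched roughly as follows. The names `swap_with_temp` and `demo_swap_with_temp` are made up for illustration and do not come from any library.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper (not a library function): the classic three-step
// swap through a temporary, written as a template so it works for any
// movable type.
template <typename T>
void swap_with_temp(T& a, T& b) {
    T temp = std::move(a);  // save the old value of a
    a = std::move(b);       // a takes b's value
    b = std::move(temp);    // b takes the saved value
}

// Small demonstration; call it from main() to try the helper.
void demo_swap_with_temp() {
    int x = 3, y = 5;
    swap_with_temp(x, y);
    assert(x == 5 && y == 3);

    std::string s = "left", t = "right";
    swap_with_temp(s, t);
    assert(s == "right" && t == "left");
}
```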

Many less obvious operators have been created, like the assignment operators (e.g. +=, which was probably created for reflecting the cumulative machine instructions, e.g. add ax,bx), or the ?? operator in C#.

So, what is the reason? Or does it actually exist, and I always missed it?

Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
mr_georg
  • 3,635
  • 5
  • 35
  • 52
  • Good questions. I've wondered about this too, though the responses make sense. – pro3carp3 Nov 13 '08 at 16:33
  • 1
    The SWAP statement *does* exist in QBasic. – dan04 Jan 18 '11 at 03:14
  • More interesting than a 'swap two variables' macro would be something that behaved like the comma operator, but whose value was that of the first operand. Imagine the operator were written as "=:" (I really don't know what notation would be good). Then "x=(y =: y=x)" would swap x and y. Other uses would be, e.g. "return (y =: y+=4)" would be conceptually similar to the post-increment operator, but incrementing by 4 rather than one. – supercat Aug 29 '11 at 02:25
  • 1
    @supercat: "x, y = y, x" in python, called parallel assignment. – Jürgen Strobel Oct 21 '11 at 00:29
  • @Jürgen Strobel: I would envision my operator as being useful in many more contexts. From what I understand, the above instruction asks Python to evaluate everything on the right, and then assign them to the things on the left, in place order. I was thinking more in terms of scenarios where one wants to have a function modify something, but return the value it had prior to the modification. – supercat Oct 24 '11 at 00:00
  • @supercat: "y = x; x += 1; return y". Not so hard and totally clear. Just read the official C++ standard and behold the ultimate mess and complexity e.g. "x++ * x++" creates for a sound argument against post-increment/decrement. – Jürgen Strobel Oct 24 '11 at 00:29
  • @Jürgen Strobel: My idea would be for an operator which, like the comma operator, had defined sequence-point behavior, but which used the left-hand operand rather than the right. Two sequence points would be established by the special operator--one before it, and one before the result of the expression is used in any other expression. The increment and decrement operators cause confusion in significant measure because they don't have any defined sequence points. The problem with your formulation is that it requires defining a new temporary variable y. – supercat Oct 24 '11 at 00:43
  • @supercat: It's off topic really. But how do you think the compiler/interpreter would treat this anyway? Magically do the later part after returning, or introduce exactly the same temporary variable and return it after the full calculation? This is useless and confusing syntactic sugar. – Jürgen Strobel Oct 24 '11 at 00:52
  • @Jürgen Strobel: Since the question is why no swap/exchange operator exists, it seems on-topic. A compiler could implement the aforementioned operator by creating a new temporary variable, but that doesn't make it useless syntactic sugar. I'll readily admit the operator needs some combination of characters other than :=, but creating variables which are written and read precisely once within what is functionally one expression seems a bit icky. I don't know that the expression would yield code that a modern optimizer couldn't, but... – supercat Oct 24 '11 at 04:27
  • @Jürgen Strobel: ...the most natural generated code for the expression would differ from what an optimizer would generate. For example, "return n :=: n+=5;" on the x86 would yield "mov eax,[n] / add dword [n],5", whereas the normal code with a temp would be "mov eax,[n] / mov [ebp+_temp],eax / add dword [n],5 / mov eax,[ebp+_temp]". A redundant store and a redundant load. An optimizer might nix those instructions, but if a language featured such an operator and people got used to it, I would aver that some types of expression would be more idiomatic with it than with a temp. – supercat Oct 24 '11 at 04:31
  • @supercat: optimizers are a lot smarter than you think, and premature micro-optimization like that is usually "... the root of all evil" (D. Knuth), especially in the context of the question's "many languages". – Jürgen Strobel Oct 24 '11 at 12:02
  • @Jürgen Strobel: Optimizers in the small embedded-systems world tend to lag behind those for "full-size" processors, but even if they didn't, temporary variables often clutter code and impair legibility. Perhaps the situation would be better if there were a common convention for writing scoping blocks that aren't associated with control structures (e.g. "#define SCOPING_BLOCK 1 ... if (SCOPING_BLOCK) {int whatever; do_something();}"), but I don't know of any common conventions for that. I dislike defining variables mid-block, because it's hard for a reader to know their real scope. – supercat Oct 24 '11 at 14:44
  • @supercat Temporary variables only clutter code and impair legibility in situations similar to making one named `temp` and reusing that same variable every time you need a temporary variable. (In other words, there's no problem if you just name your variables and comment your code properly. Which, admittedly, can be strangely hard for many people, and is hard for anyone when they're in a rush.) Oh, and by the way, you can just use a pair of curly brackets to create a new inner scope in C. http://ideone.com/6mtsy – JAB Jun 15 '12 at 20:38
  • @JAB: One can use `{` and `}` to define a block that isn't part of a control statement, but it can be tough to distinguish visually between a deliberately-placed scoping-only compound statement, and one which is leftover from a clumsy-fingered editing operation. One can add a comment, but in the absence of a standard convention, a programmer who noticed the odd-looking block when proof-reading would have to actually read the comment to recognize the intention. – supercat Jun 15 '12 at 20:38
  • A programmer who doesn't take the time to read comments ends up wasting more time than a programmer who does. (As an aside, I just noticed the mention of C#'s `??` operator, and it reminded me that Python allows you to use `or` similarly, as `bool(None)` == `False` and the return value of a [short-circuiting] boolean operation is the last operand evaluated.) – JAB Jun 15 '12 at 20:48
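For what it is worth, the "store a new value, yield the old one" operation discussed in the comments above resembles `std::exchange` from C++14's `<utility>`. A minimal sketch follows, assuming a C++14 compiler. The function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <utility>  // std::exchange (C++14)

// Hypothetical example: add 5 to n but return the value it had before,
// without a named temporary in user code.
int add_five_return_old(int& n) {
    return std::exchange(n, n + 5);
}

// Hypothetical example: a swap built on the same primitive.
void swap_via_exchange(int& x, int& y) {
    x = std::exchange(y, x);  // y receives x; x receives y's old value
}

// Small demonstration; call it from main() to try both helpers.
void demo_exchange() {
    int n = 10;
    int old = add_five_return_old(n);
    assert(old == 10 && n == 15);

    int a = 1, b = 2;
    swap_via_exchange(a, b);
    assert(a == 2 && b == 1);
}
```

The library still uses a temporary internally, so this is about notation rather than generated code.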

5 Answers

11

In my experience, it isn't that commonly needed in actual applications, apart from the already-mentioned sort algorithms and occasionally in low-level hardware poking, so in my view it's a bit too special-purpose to have in a general-purpose language.

As has also been mentioned, not all processors support it as an instruction (and many do not support it for objects bigger than a word). So if it were supported with some useful additional semantics (e.g. being an atomic operation) it would be difficult to support on some processors, and if it didn't have the additional semantics then it's just (seldom used) syntactic sugar.

The assignment operators (+= etc) were supported because these are much more common in real-world programs - and so the syntactic sugar they provide was more useful, and also as an optimisation - remember C dates from the late 60s/early 70s, and compiler optimisation wasn't as advanced (and the machines less capable, so you didn't want lengthy optimisation passes anyway).

Paul

The Archetypal Paul
  • 41,321
  • 20
  • 104
  • 134
  • Nice answer. But there are more algorithms in which it is used: the Fisher-Yates algorithm (modern version), the Steinhaus-Johnson-Trotter algorithm, matrix transposition. And in more specialized ones, where it is used as an argument for reducing space complexity. I admit that these are specialized cases. – mr_georg Nov 13 '08 at 13:01
  • 1
    "these are specialized cases" Exactly. – The Archetypal Paul Nov 13 '08 at 13:23
  • 1
    An optimizing compiler can still use native swap instructions whenever it spots the pattern, so it's not even needed for the cited specialized cases. – Jürgen Strobel Oct 21 '11 at 00:23
  • 1
    Some languages (like Python) have parallel assignment, which is more generic and not tied to symmetrically swapping exactly 2 values. – Jürgen Strobel Oct 21 '11 at 00:25
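C++ has no parallel-assignment syntax, but something similar can be approximated with `std::tie` and `std::make_tuple` from `<tuple>`. The sketch below assumes C++11 or later. The helper names are invented for illustration.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical helper: swap two values with one tuple assignment.
// std::make_tuple copies the right-hand values before anything on the
// left is overwritten, much like Python's "x, y = y, x".
void swap_pair(int& x, int& y) {
    std::tie(x, y) = std::make_tuple(y, x);
}

// Hypothetical helper: the same idea is not limited to two values.
// This rotates three values, like "a, b, c = b, c, a".
void rotate_left(int& a, int& b, int& c) {
    std::tie(a, b, c) = std::make_tuple(b, c, a);
}

// Small demonstration; call it from main() to try both helpers.
void demo_parallel_assignment() {
    int x = 3, y = 5;
    swap_pair(x, y);
    assert(x == 5 && y == 3);

    int a = 1, b = 2, c = 3;
    rotate_left(a, b, c);
    assert(a == 2 && b == 3 && c == 1);
}
```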
5

C++ does have swapping.

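#include <algorithm>  // std::swap lived here before C++11
#include <cassert>
#include <utility>    // std::swap since C++11

int
main()
{
    using std::swap;  // allows unqualified calls, so custom swaps are found too
    int a(3), b(5);
    swap(a, b);
    assert(a == 5 && b == 3);
}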

Furthermore, you can specialise swap for custom types too!
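One common way to do that is a non-member `swap` declared in the same namespace as the type, which unqualified calls find through argument-dependent lookup. A rough sketch; the `demo::Buffer` class is hypothetical and copying is disabled just to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

namespace demo {

// Hypothetical type that owns a heap buffer.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~Buffer() { delete[] data_; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::size_t size() const { return size_; }

    // Custom swap: exchanges the members only, so no elements are copied.
    friend void swap(Buffer& a, Buffer& b) noexcept {
        using std::swap;
        swap(a.size_, b.size_);
        swap(a.data_, b.data_);
    }

private:
    std::size_t size_;
    int* data_;
};

}  // namespace demo

// Small demonstration; call it from main() to try the custom swap.
void demo_buffer_swap() {
    demo::Buffer a(3), b(5);
    using std::swap;  // fallback for types without their own swap
    swap(a, b);       // the unqualified call selects demo::swap
    assert(a.size() == 5 && b.size() == 3);
}
```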

C. K. Young
  • 219,335
  • 46
  • 382
  • 435
3

It's a widely used example in computer science courses, but I almost never find myself needing it in real code - whereas I use += very frequently.

Yes, in sorting it would be handy - but you don't tend to need to implement sorting yourself, so the number of actual uses in source code would still be pretty low.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • Swapping two lvalues isn't common, but it's pretty common to want to store and read an lvalue simultaneously. In a sense, that's what the postfix ++ and -- operators do (they do a store of a modified value, but the value propagated to the rest of the expression is the pre-update value). – supercat Oct 21 '11 at 14:47
0

You do have the XOR operator, which can swap two variables of a primitive type without a temporary...
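A minimal sketch of the XOR trick for unsigned integers. The function name `xor_swap` is made up, and the address check is there because XOR-ing an object with itself sets it to zero.

```cpp
#include <cassert>

// Hypothetical helper: XOR swap for unsigned integers.
// If a and b refer to the same object, the first XOR would zero it
// and the value would be lost, hence the guard.
void xor_swap(unsigned& a, unsigned& b) {
    if (&a == &b) return;  // aliasing guard
    a ^= b;  // a = a0 ^ b0
    b ^= a;  // b = b0 ^ (a0 ^ b0) = a0
    a ^= b;  // a = (a0 ^ b0) ^ a0 = b0
}

// Small demonstration; call it from main() to try the helper.
void demo_xor_swap() {
    unsigned x = 3, y = 5;
    xor_swap(x, y);
    assert(x == 5 && y == 3);

    unsigned z = 7;
    xor_swap(z, z);  // same object: left unchanged thanks to the guard
    assert(z == 7);
}
```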

Community
  • 1
  • 1
VonC
  • 1,262,500
  • 529
  • 4,410
  • 5,250
  • That can cause an error if the two things you're swapping have the same storage location though. It'll zero them out. I've also been told that on modern processors, it's less efficient than just using a temporary variable. – Rob Rose Apr 06 '17 at 03:11
-1

I think they just forgot to add it :-) Yes, not all CPUs have this kind of instruction, so what? We have a bunch of other things that most CPUs don't have instructions to compute. It would be much easier/clearer and also faster (as an intrinsic) if we had it!!!

Malkocoglu
  • 2,522
  • 2
  • 26
  • 32
  • Wouldn't necessarily be any faster - the compiler could inline the call to swap (and I would imagine some already do) if it really is performance critical. – The Archetypal Paul Nov 13 '08 at 09:15
  • 1
    What about the underlying hardware? If it had an XCHG or similar instruction, the swap would be done without needing a temporary variable and it would execute faster. There is more to intrinsics than inlining! – Malkocoglu Nov 14 '08 at 08:19
  • ...and there's more to optimization than inlining. Compilers are capable of figuring out when a swap is occurring (it's not *that* hard to identify a temp variable and use XCHG instead). – Imagist Sep 19 '09 at 06:58