Why is there no inbuilt swap function in C but there is xchg in Assembly?

Question

Recently I came across Assembly language. x86 assembly has an xchg instruction which swaps the contents of two registers.

Since all C code is first translated to assembly, it would have been nice if C had a built-in swap function, declared in a standard header such as stdio.h. Then, whenever the compiler sees a call to swap, it could emit the xchg instruction in the assembly output.

So why was this swap function not implemented in C?

At the time C was developed, this may not have been a common instruction, so they didn't see the point of it. — Barmar, Nov 09 '16 at 17:09
"Assembly has an xchg directive" *Which* assembler? *What* platform? — Andrew Henle, Nov 09 '16 at 17:17
There is also no rotate function in C, but there is one in many machine languages. And which "Assembly" do you refer to? Many CPUs don't have such an instruction. — too honest for this site, Nov 09 '16 at 17:29
C doesn't have separate logical and arithmetic shift right operations. C doesn't support a carry bit. C doesn't have direct support for overflow on add or multiply, and doesn't have direct support for a borrow on subtract. It doesn't directly support floating point round up, round down, or round to zero options. Divide by zero. And the answer is that it is generic: there is no reason to supply those, any more than you would in Pascal, Python, Java, or any other high level language. — old_timer, Nov 09 '16 at 18:22
most CPUs (and microprocessors/microcontrollers, etc) do NOT have such an instruction. Even now, the best that is usually available is to swap the high/low nibbles of a byte — user3629249, Nov 10 '16 at 04:09
Only a really dumb compiler would want to actually emit `xchg` every time the source swapped variables. It's not faster than 3 `mov` instructions, and a good compiler can simply change its local-variable <-> CPU register mapping without emitting any asm instructions. (Or inside a loop, unrolling can often optimize away swapping.) Often you need only 1 or 2 `mov` instructions in asm, not all 3. See also [Why is XCHG reg, reg a 3 micro-op instruction on modern Intel architectures?](https://stackoverflow.com/q/45766444) — Peter Cordes, Oct 29 '18 at 03:36

Eugene Sh. · Accepted Answer · 2016-11-09T17:25:15.783

8

C is a cross-platform language. Assembly is architecture specific. Not every architecture has such an instruction. Moreover, C, as a high-level language, doesn't have to correspond to the machine-level instruction set and features, as its purpose is to bridge between the "human" language and the machine language, not to mimic it. That said, a C compiler for this specific architecture might have an extension for this swapping instruction, or optimize the swapping code to use this instruction if it is smart enough.

edited Nov 09 '16 at 17:25

answered Nov 09 '16 at 17:06


3

Language features don't have to correspond to instructions that every architecture has. You can implement C on architectures with no floating point hardware; it gets done in software instead. – Barmar Nov 09 '16 at 17:09
1

An interesting question remains: why should this (suggested (and on x86) native function) not be emulated on other architectures than x86 by the C compiler? – zx485 Nov 09 '16 at 17:14
@zx485 Do you find it useful enough to be added as a part of the language? I don't... – Eugene Sh. Nov 09 '16 at 17:18
1

@EugeneSh.: Well...yes. There are **many** algorithms that could make use of it, e.g. _BubbleSort_ as a simple example. – zx485 Nov 09 '16 at 17:20
4

@zx485 Well, I guess if there are enough supporters, it will make its way into the standard at some point. – Eugene Sh. Nov 09 '16 at 17:23
@zx485: The C standard library already includes a sort function (`qsort`). Now, there may be occasions when you need to roll your own, because you need a guarantee of a stable sort and/or O(n log n) worst-case running time. But how many non-sorting uses would a `swap` function have in C? (C++ code uses `swap` a lot, but for reasons related to copy-construction, overloaded assignment, and destructors, which C doesn't have.) – dan04 Nov 09 '16 at 17:27
@zx485 - at a guess, because C was around quite some time before the x86 architecture was. It seems odd that a language be altered/supplemented by something that achieves the same task as just one(some) of the many microprocessor archs it runs on. – enhzflep Nov 09 '16 at 17:28
And still, there is a question about the semantics. What should it swap? Numbers? Which numbers? Or arbitrary-typed objects? In any case it will have to work with pointer arguments. It will involve some extra checks and calls and branches. I believe it isn't worth the hassle, since it can be implemented in three straightforward statements. – Eugene Sh. Nov 09 '16 at 17:32
1

@enhzflep: That may have been so, but the task of swapping two (memory/whatever) locations had been around far longer. So why has it not been included in an "very old" _high level language_? – zx485 Nov 09 '16 at 17:32
@zx485: No one keeps you from writing a simple function using machine-dependent inline assembler, or from leaving it to the compiler to recognise the pattern and emit the instruction if available. It simply makes no sense to provide some built-in for every possibly useful machine instruction. Why do you think there is no rotate operator in C, no parity operator or string-compare/-copy operator? All of these are instructions some CPUs provide. Read what K.I.S.S. means. – too honest for this site Nov 09 '16 at 17:33
@EugeneSh.: Swapping two values only makes sense on simple types and possibly very small `struct`s. For normal `struct`s as well as arrays, most times it is better to use pointers. And experienced programmers have their standard library for such misc. stuff anyway. I can't remember if I ever had use for such a swap function in my projects. And that's more than 30 years of programming. – too honest for this site Nov 09 '16 at 17:39
1

This discussion gets nowhere. No one has kept me from writing such a function. In fact, I did - several times. But a native C/ASM (keyword/opcode) would have been handy. I just ascertained. Nothing more. – zx485 Nov 09 '16 at 17:40
@zx485: C has no **op-codes**! You might want to learn more than x86 assembly before recommending new features for high-level languages. As I wrote: most CPUs with GP registers don't even have such an instruction! Just because **you** swap values frequently does not mean we all do. Even if: what keeps you from writing your own generic function once and forever? – too honest for this site Nov 09 '16 at 17:44
1

@Olaf: Of course C has not opcodes, but it has keywords. That's why I wrote in two-tuples: _(C/ASM)_ - _(keyword/opcode)_. As I said: this discussion is going nowhere. BTW: you never tried to swap two values? – zx485 Nov 09 '16 at 17:48
No, I never **tried**. I can't remember ever needing to swap two values in my serious projects. And **iff** I did, I did not try, but just did it. There is a reason it is not part of any language itself, except for stack-oriented ones like Forth. – too honest for this site Nov 09 '16 at 17:52
@zx485, you ask "*why has it not been included in an "very old" high level language?*", but those of us who did not participate in C's invention or standardization can only interpret history here. My interpretation is two-pronged: (1) C was not initially given such an operation because the languages that inspired C did not have one, because the PDP-7 on which C was first implemented had no corresponding machine instruction, and because it is not a primitive operation with respect to memory locations. (2) No one since has considered adding such an instruction to be sufficiently valuable. – John Bollinger Nov 09 '16 at 18:43
1

@JohnBollinger C started off on the PDP-11, not the PDP-7. That aside, [Wikipedia](https://en.wikipedia.org/wiki/PDP-11_architecture#Instruction_set) and the [handbook PDF](http://bitsavers.trailing-edge.com/pdf/dec/pdp11/handbooks/PDP11_Handbook1979.pdf) it references support your guess about a lack of corresponding instructions. The only instruction that swaps things is `SWAB`, but that only swaps the high and low order bytes of the given word, hardly a general purpose swapping operation. – 8bittree Nov 09 '16 at 19:45
@8bittree, oops, yes, it was B that was written for the PDP-7. Differences between the PDP-7 and the PDP-11 were among the reasons for C's creation and the influences on its design. Language historians may find [Ritchie's essay on C's early history](https://www.bell-labs.com/usr/dmr/www/chist.html) an interesting read. – John Bollinger Nov 09 '16 at 20:09

score 5 · Answer 2 · edited Nov 09 '16 at 23:02

That would work only for variables that fit in a register and are currently held in one. It would not work for large structs or for variables held in memory. (And if you load a variable A into reg X and another, say B, into reg Y, and then swap them, you could skip the swap entirely and load A into Y and B into X directly.)

Having said that, nothing prevents the compiler for a given architecture from using the swap instruction to compile:

 int a;
 int b;
 int tmp;
 tmp=a;
 a=b;
 b=tmp;

... if those happen to be in registers. The fact that swap is not in C does not mean the compiler cannot use the instruction.
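As a minimal sketch of that point, here is the same three-statement swap inside a small function. The name `ordered_diff` is hypothetical and chosen only for illustration. Because `a` and `b` are plain local values, a compiler is free to implement the swap with `xchg`, with moves, or with no instruction at all by simply using the registers the other way round. Which of these it picks depends on the compiler, the optimization level, and the target.

```c
/* Hypothetical example: returns the non-negative difference of two ints
 * (assumes the difference itself does not overflow an int). */
int ordered_diff(int a, int b)
{
    if (a < b) {
        /* Plain C swap of two locals; no particular instruction is implied. */
        int tmp = a;
        a = b;
        b = tmp;
    }
    /* Here a >= b always holds. */
    return a - b;
}
```

Calling `ordered_diff(3, 10)` and `ordered_diff(10, 3)` gives the same result, since the swap only reorders the two locals.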

score 5 · Answer 3 · answered Nov 09 '16 at 17:56

There are two points which can explain why swap() is not in C

1. Function call semantics:
Including a swap() function would break a very fundamental design decision in C: swap() can only work with pass-by-reference semantics (which C++ added to the language, but which are absent in C), not with pass-by-value.

2. Diversity of available assembler instructions
Apart from that, there is usually quite a number of assembler instructions on any given CPU architecture which are totally inaccessible from pure C. This includes instructions as diverse as interrupt handling instructions, virtual memory space manipulating instructions, I/O instructions, bit fiddling instructions (google the PPC instruction rlwimi for an especially powerful example of this), etc.

It is simply impossible to include any significant number of these in a general purpose language like C.

Some of these are crucial for implementing operating systems, which is why any OS must include at least a small amount of assembler code. Such instructions are usually encapsulated in functions with inline assembler or defined in the kernel headers as preprocessor macros. Other instructions are less important, or only useful for optimization; these may be generated by optimizing compilers, and many compilers do generate them (the whole class of vector instructions falls into this category).

In the face of this vast diversity, the designers of C just had to cut it somewhere. And they opted for providing whatever is representable as simple operators like (+, -, ~, &, |, !, &&, ||, etc.), but did not provide anything that would require function call syntax like the swap() function you propose.
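To illustrate the first point, here is a rough sketch of what a general swap has to look like in C. Since arguments are passed by value, the caller must pass addresses, and since C has no way to express "any type" other than `void *`, the object size has to be passed as well. The name `swap_bytes` and its signature are assumptions made for this example; nothing like it exists in the standard library.

```c
#include <stddef.h>

/* Hypothetical generic swap: exchanges `size` bytes between two objects.
 * The two objects must either be the same object or not overlap at all. */
void swap_bytes(void *a, void *b, size_t size)
{
    unsigned char *pa = a;
    unsigned char *pb = b;

    /* Exchange the objects one byte at a time. */
    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

A call would look like `swap_bytes(&x, &y, sizeof x);`. Note how much further this is from a single register exchange than the name "swap" suggests.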

score 3 · Answer 4 · answered Oct 29 '18 at 03:49

Besides what the other correct answers say, another part of your premise is wrong.

Only a really dumb compiler would want to actually emit xchg every time the source swapped variables, whether there's an intrinsic or operator for it or not. Optimizing compilers don't just transliterate C into asm, they typically convert to an SSA internal representation of the program logic, and optimize that so they can implement it with as few instructions as possible (or really in the most efficient way possible; using multiple fast instructions can be better than a single slower one).

xchg is rarely faster than 3 mov instructions, and a good compiler can simply change its local-variable <-> CPU register mapping without emitting any asm instructions in many cases. (Or inside a loop, unrolling can often optimize away swapping.) Often you need only 1 or 2 mov instructions in asm, not all 3. E.g. if only one of the C vars being swapped needs to stay in the same register, you can do:

# start:   x in EAX,  y in ECX 
mov    edx, eax
mov    eax, ecx
# end:     y in EAX,  x in EDX
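For a C-level view of the same idea, consider this sketch of Euclid's algorithm written with an explicit swap in the loop body. The function name `gcd_swap` is made up for the example. The source swaps `a` and `b` on every iteration, but an optimizing compiler is not obliged to emit any exchange for it: it can keep the remainder wherever the division left it and just track which register currently holds which variable. What a given compiler actually emits will vary, so treat this as an illustration rather than a guarantee.

```c
/* Hypothetical example: greatest common divisor via Euclid's algorithm,
 * written with a source-level swap on each iteration. */
unsigned gcd_swap(unsigned a, unsigned b)
{
    while (b != 0) {
        a %= b;              /* now a < b */

        /* Swap so that the larger value is in a again. */
        unsigned tmp = a;
        a = b;
        b = tmp;
    }
    return a;
}
```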

See also [Why is XCHG reg, reg a 3 micro-op instruction on modern Intel architectures?](https://stackoverflow.com/q/45766444)

Also note that xchg [mem], reg is atomic (implicit lock prefix), and thus is a full memory barrier, and much slower than 3 mov instructions, and with much higher impact on surrounding code because of the memory-barrier effect.

If you do actually need to exchange registers, 3x mov is pretty good. Often better than xchg reg,reg because of mov elimination, at the cost of more code-size and a tmp reg.

There's a reason compilers never use xchg. If xchg was a win, compilers would look for it as a peephole optimization the same way they look for inc eax over add eax,1, or xor eax,eax instead of mov eax,0. But they don't.

(semi-related: swapping 2 registers in 8086 assembly language(16 bits))

score 1 · Answer 5 · answered Nov 09 '16 at 17:49

Even though xchg is a very elementary instruction, this doesn't mean C must have its equivalent. The fact that C sometimes maps directly to assembly is not very relevant; the standard says nothing about "assembly" (why map to assembly and not another low-level language?).

You might also ask: Why does C not have built-in vector instructions? They're becoming widely available!

There's also help from the compiler: swapping variables is a very visible pattern, so such an optimization shouldn't be hard to implement. And you also have inline asm, should you need it.
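If you really do want the instruction, a sketch along these lines is possible with the GNU extended inline asm extension (supported by GCC and Clang, not part of standard C). The helper name `swap_int_xchg` is hypothetical. On other compilers or non-x86 targets the sketch falls back to the ordinary three-statement swap, which is usually what you would want anyway.

```c
/* Hypothetical helper: swaps two ints, using x86 xchg where GNU inline asm
 * is available and a plain C swap everywhere else. */
void swap_int_xchg(int *a, int *b)
{
    int x = *a;
    int y = *b;

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+r" marks each operand as a register that is both read and written. */
    __asm__("xchg %0, %1" : "+r"(x), "+r"(y));
#else
    /* Portable fallback: ordinary swap through a temporary. */
    int tmp = x;
    x = y;
    y = tmp;
#endif

    *a = x;
    *b = y;
}
```

Forcing the instruction like this also hides the swap from the optimizer, so it is unlikely to be faster than the plain C version.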
