A single bit is no easier than an arbitrary bitmask, so let's just talk about the general case. You can always call this function with 1U << bitpos as the mask.
If a bit position is the same in both values, no change is needed in either. If it's opposite, they both need to invert.
XOR with 1 flips a bit; XOR with 0 is a no-op.
So what we want is a value that has a 1 everywhere there's a bit-difference between the inputs, and a 0 everywhere else. That's exactly what a XOR b does. Simply mask this to only swap some of the bits, and we have a bit-swap in 3 XORs + 1 AND.

    // call with unsigned char mask = 1U << bitPosition; if you want
    // static: in C99, plain inline alone doesn't emit a definition if the call isn't inlined
    static inline
    void swapBit_char(unsigned char *A, unsigned char *B, unsigned char mask)
    {
        unsigned char tmpA = *A, tmpB = *B;   // read into locals in case A==B
        unsigned char bitdiff = tmpA ^ tmpB;  // 1 wherever the inputs differ
        bitdiff &= mask;                      // only swap bits matching the mask
        *A = tmpA ^ bitdiff;                  // flip the differing bits in both values
        *B = tmpB ^ bitdiff;
    }

(Godbolt compiler explorer with gcc for x86-64 and ARM, includes a version with unsigned instead of unsigned char.)
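
As a usage sketch for the single-bit case, a thin wrapper can build the mask from a bit position. The name swap_bit_at and its bounds assert are assumptions made for illustration, not part of the function above; swapBit_char is repeated (slightly condensed) so the snippet compiles on its own.

```c
#include <assert.h>
#include <limits.h>

static inline
void swapBit_char(unsigned char *A, unsigned char *B, unsigned char mask)
{
    unsigned char tmpA = *A, tmpB = *B;            // read into locals in case A==B
    unsigned char bitdiff = (tmpA ^ tmpB) & mask;  // differing bits, limited to the mask
    *A = tmpA ^ bitdiff;
    *B = tmpB ^ bitdiff;
}

// Hypothetical helper: swap only bit number bitpos (0 = least significant).
static inline
void swap_bit_at(unsigned char *A, unsigned char *B, unsigned bitpos)
{
    assert(bitpos < CHAR_BIT);                     // the mask must fit in an unsigned char
    swapBit_char(A, B, (unsigned char)(1U << bitpos));
}
```

For example, with a = 0xAC and b = 0x65, swap_bit_at(&a, &b, 0) leaves a = 0xAD and b = 0x64. Calling swapBit_char(&a, &b, 0x0F) on the original values instead exchanges the low nibbles, giving 0xA5 and 0x6C.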
You could consider if(bitdiff) { ... }, but unless you're going to avoid dirtying a cache line in memory by avoiding the assignments, it's probably not worth doing any conditional behaviour. With values in registers (after inlining), a branch to save two xor instructions is almost never worth it.
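
To make that trade-off concrete, here is a sketch of what the conditional version could look like. The name swapBit_char_cond, the pointer assert, and the return value are all hypothetical additions for illustration. The only functional difference from the unconditional swap is that both stores are skipped when no masked bit differs.

```c
#include <assert.h>

// Hypothetical variant: skip both stores when the masked bits already match.
// Returns 1 if the values were written, 0 if nothing needed to change.
static inline
int swapBit_char_cond(unsigned char *A, unsigned char *B, unsigned char mask)
{
    assert(A && B);
    unsigned char tmpA = *A, tmpB = *B;            // read into locals in case A==B
    unsigned char bitdiff = (tmpA ^ tmpB) & mask;  // differing bits, limited to the mask
    if (bitdiff) {                                 // branch: only store when something changes
        *A = tmpA ^ bitdiff;
        *B = tmpB ^ bitdiff;
        return 1;
    }
    return 0;                                      // memory left untouched
}
```

This can only pay off when *A and *B really live in memory and the masked bits usually already match. Once the values are in registers, the branch costs more than the two XORs it skips.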
This is not an xor-swap. It does use temporary storage. As @chux's answer demonstrates, a masked xor-swap requires 3 AND operations as well as 3 XOR. (And defeats the only benefit of XOR-swap by requiring a temporary register or other storage for the & results.)
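
For comparison, a masked xor-swap could be sketched as below. This is not copied from @chux's answer; the name masked_xor_swap, the pointer assert, and the aliasing guard are assumptions made for illustration. It shows where the 3 ANDs and 3 XORs come from.

```c
#include <assert.h>

// Hypothetical masked xor-swap, written for comparison only.
static inline
void masked_xor_swap(unsigned char *A, unsigned char *B, unsigned char mask)
{
    assert(A && B);
    if (A == B)
        return;          // an xor-swap through aliased pointers would clear the masked bits
    *A ^= *B & mask;     // AND #1, XOR #1: masked bits of A now hold A ^ B
    *B ^= *A & mask;     // AND #2, XOR #2: masked bits of B now hold the original A
    *A ^= *B & mask;     // AND #3, XOR #3: masked bits of A now hold the original B
}
```

Each `& mask` result needs somewhere to live before it is XORed in, and each step depends on the previous one, so the chain is longer than in the single-AND version.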
This version only requires 1 AND. Also, the last two XORs are independent of each other, so the total latency from inputs to both outputs is only 3 operations: XOR, AND, XOR (typically 3 cycles).
For an x86 asm example, see this code-golf: Exchange capitalization of two strings in 14 bytes of x86-64 machine code (with commented asm source).