It's so you can mask bits off and set bits. At least that's the simple answer. In C, & is the bitwise AND operator and | is the bitwise OR operator. Both work on every bit of their operands independently, which makes them different from the logical && and || used for boolean tests. Take a look at the truth tables below.
   AND            OR
A  B  X        A  B  X
0  0  0        0  0  0
0  1  0        0  1  1
1  0  0        1  0  1
1  1  1        1  1  1
A and B are inputs and X is the output. So when you do a 16-bit endian swap, you would use a macro like this:
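#define endianswap16(x) ((((x) << 8) | ((x) >> 8)) & 0xFFFF)
This shifts x left by 8 bits and right by 8 bits, then ORs the two results together to get the endian swap. The final & 0xFFFF throws away the bits that the left shift pushed above bit 15, and the extra parentheses keep the macro safe when x is an expression. The 32-bit endian swap leans on & much more heavily, using it together with | and bit shifting to pick out the individual bytes: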
#define endianswap32(x) (((x) << 24) | (((x) & 0x0000FF00) << 8) \
                       | (((x) & 0x00FF0000) >> 8) | ((x) >> 24))
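Since 32 bits is 4 bytes, this swaps the two outer bytes with each other and then swaps the two inner bytes with each other. Then it does bitwise ORs to put the 32-bit number back together. The ANDs mask off everything except the byte being moved, so each term going into the ORs has zeros in every other position and cannot disturb the bytes supplied by the other terms. Note that x should be an unsigned 32-bit value; with a signed type the right shift may copy the sign bit into the upper bits.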
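For comparison, here is a minimal sketch of the same two swaps written as functions rather than macros. The names swap16 and swap32 are made up for this example, and the fixed-width types assume a C99 compiler that provides <stdint.h>. Functions evaluate x only once and pin down the operand type, which sidesteps the usual macro pitfalls.

```c
#include <stdint.h>

/* Hypothetical helper: swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t x)
{
    /* x is promoted to int before the shifts, so the cast back to
       uint16_t drops the bits the left shift moved above bit 15. */
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Hypothetical helper: reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t x)
{
    return (x << 24)                    /* lowest byte  -> highest byte */
         | ((x & 0x0000FF00u) << 8)     /* second byte  -> third byte   */
         | ((x & 0x00FF0000u) >> 8)     /* third byte   -> second byte  */
         | (x >> 24);                   /* highest byte -> lowest byte  */
}
```

With these, swap16(0x1234) gives 0x3412 and swap32(0x12345678) gives 0x78563412. Swapping twice returns the original value.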
As for why we do it this way: we have to reverse the order of the bytes. Take 0x12345678, for instance. Stored in memory on a little endian and on a big endian machine, it would look like this:
---> Increasing Memory Address
78 56 34 12 Little Endian
12 34 56 78 Big Endian
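If you want to see this layout on your own machine, something along these lines should work. It is only a sketch: bytes_in_memory and host_is_little_endian are hypothetical names, and the check assumes the host is either plain little endian or plain big endian.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of value exactly as they sit in
   memory into out[0..3], lowest address first. */
static void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Hypothetical helper: returns 1 if the least significant byte is
   stored at the lowest address (little endian), 0 otherwise. */
static int host_is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(0x12345678u, b);
    return b[0] == 0x78;
}
```

On a little endian host the four bytes come out as 78 56 34 12, and on a big endian host as 12 34 56 78, matching the diagram above.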
Intel processors and their clones are little endian, which has some advantages over the big endian format. Big endian machines include the IBM S/360 and its descendants (z/Architecture), SPARC, the Motorola 68000 series, PowerPC, MIPS, and others, although PowerPC and MIPS can be configured to run in either mode.
There are two big problems when dealing with platforms that differ in endianness:
- The first one is when exchanging binary data between big and little endian platforms.
- The second one is when software takes a multibyte value and splits it up into different byte values.
An example of the first problem is Intel machines communicating over a network. Network addresses are transmitted in network byte order, which is big endian, but Intel machines are little endian. So IP addresses and similar fields need to be byte-swapped before they can be interpreted correctly.
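On POSIX systems the usual tools for this are htonl() and ntohl() from <arpa/inet.h>. The sketch below does the same job by hand so you can see what is going on. put_be32 and get_be32 are hypothetical names, not library functions. Because shifts operate on the value rather than on its memory layout, this code gives the same bytes on a little endian and a big endian host, which also avoids the second problem in the list above.

```c
#include <stdint.h>

/* Hypothetical helper: store value in buf in network (big endian)
   byte order, most significant byte first. */
static void put_be32(unsigned char buf[4], uint32_t value)
{
    buf[0] = (unsigned char)((value >> 24) & 0xFF);
    buf[1] = (unsigned char)((value >> 16) & 0xFF);
    buf[2] = (unsigned char)((value >> 8) & 0xFF);
    buf[3] = (unsigned char)(value & 0xFF);
}

/* Hypothetical helper: rebuild a 32-bit value from four bytes that
   arrived in network (big endian) byte order. */
static uint32_t get_be32(const unsigned char buf[4])
{
    return ((uint32_t)buf[0] << 24)
         | ((uint32_t)buf[1] << 16)
         | ((uint32_t)buf[2] << 8)
         |  (uint32_t)buf[3];
}
```

For example, the address 192.168.1.1 is 0xC0A80101 as a 32-bit number, and put_be32 writes it out as the bytes C0 A8 01 01 no matter which kind of machine runs the code.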