// Swap endian (big to little) or (little to big)
uint32_t num = 9;
uint32_t b0,b1,b2,b3;
uint32_t res;

b0 = (num & 0x000000ff) << 24u;
b1 = (num & 0x0000ff00) << 8u;
b2 = (num & 0x00ff0000) >> 8u;
b3 = (num & 0xff000000) >> 24u;
res = b0 | b1 | b2 | b3;

I got this code from an answer posted at "Convert Little Endian to Big Endian".

I understand that the steps above swap the bytes to convert from little to big endian. But why is "&" applied with a mask (0x000000FF, 0x0000FF00, ...) for b0, b1, ... at each step, and why is "|" used at the end for the result? Can someone explain this to help me understand the conversion between endiannesses?


2 Answers


It's so you can mask bits off and set bits. At least that's the simple answer. In C, & is the bitwise AND operator and | is the bitwise OR operator. They work on each pair of bits independently, unlike the logical && and ||, which treat each operand as a single true/false value. Take a look at the truth tables below.

AND       OR
A B X     A B X
0 0 0     0 0 0
0 1 0     0 1 1
1 0 0     1 0 1
1 1 1     1 1 1

A and B are inputs and X is the output. So when you do a 16-bit endian swap, you would use a macro like this:

#define endianswap16(x)  (((x) << 8) | ((x) >> 8))

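This shifts x left by 8 bits and right by 8 bits, then ORs the two results together to get the endian swap. (It assumes x is an unsigned 16-bit value and that the result is stored back into one, so the bits shifted above bit 15 are discarded.) Take the 32-bit endian swap which uses both & and | in addition to bit shifting: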

#define endianswap32(x)  (((x) << 24) | (((x) & 0x0000FF00) << 8) \
  | (((x) & 0x00FF0000) >> 8) | ((x) >> 24))

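Since 32 bits is 4 bytes, this swaps the two outer bytes with each other and then swaps the two inner bytes with each other. Then it uses bitwise ORs to put the 32-bit number back together. The ANDs mask off everything except one byte in each inner term, so that when we OR the terms together to reconstruct the number, no term spills into another term's byte. The outer terms need no mask because shifting an unsigned 32-bit value by 24 bits already pushes the other three bytes out.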
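To see what the masks buy you, here is a small sketch that compares the swap with and without the ANDs. The function names `swap32_masked` and `swap32_unmasked` are made up for this illustration, and `x` is assumed to be a `uint32_t`.

```c
#include <stdint.h>

/* Swap with masks: each term carries exactly one byte of x. */
static uint32_t swap32_masked(uint32_t x)
{
    return (x << 24)                  /* byte 0 -> byte 3, upper bytes fall off */
         | ((x & 0x0000FF00u) << 8)   /* byte 1 -> byte 2 */
         | ((x & 0x00FF0000u) >> 8)   /* byte 2 -> byte 1 */
         | (x >> 24);                 /* byte 3 -> byte 0, zeros shift in */
}

/* Deliberately broken: without masks, the 8-bit shifts drag the
 * neighbouring bytes along and the OR smears them together. */
static uint32_t swap32_unmasked(uint32_t x)
{
    return (x << 24) | (x << 8) | (x >> 8) | (x >> 24);
}
```

With 0x12345678 as input, the masked version gives 0x78563412, while the unmasked version gives something else, because `x << 8` still contains the 0x34 and 0x78 bytes in positions that belong to other terms.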

As for why we do it this way, it's because we have to reverse the order of the bytes. Take 0x12345678 for instance. When stored in memory on little-endian and big-endian machines, it would look like this:

---> Increasing Memory Address
78 56 34 12   Little Endian
12 34 56 78   Big Endian
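Both layouts can be reproduced on any host by writing the bytes out explicitly with shifts, rather than relying on how the CPU happens to store the value. A minimal sketch; `store_le32` and `store_be32` are hypothetical helper names, not standard functions.

```c
#include <stdint.h>

/* Little endian: least significant byte at the lowest address. */
static void store_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    out[2] = (unsigned char)((v >> 16) & 0xFFu);
    out[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Big endian: most significant byte at the lowest address. */
static void store_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}
```

Calling either helper with 0x12345678 fills the buffer in the order shown in the corresponding row above, whatever the byte order of the machine running it.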

Intel processors and their clones are little-endian machines, a format which actually has some advantages over big endian. Big-endian machines include the IBM S/360 and its descendants (z/Architecture), SPARC, the Motorola 68000 series, PowerPC, MIPS, and others.

There are two big problems when dealing with platforms that differ in endianness:

  • The first one is when exchanging binary data between big and little endian platforms.
  • The second one is when software takes a multibyte value and splits it up into different byte values.

An example of this is Intel machines communicating over the network. Network addresses are sent in network byte order, which is big endian, but Intel machines are little endian. So IP addresses and similar fields need to have their bytes swapped for them to be interpreted correctly.
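On POSIX systems the usual way to do that swap is `htonl` and `ntohl` from `<arpa/inet.h>`, which convert between host and network byte order and do nothing on big-endian hosts. A rough sketch, assuming a POSIX platform; the wrappers `put_net32` and `get_net32` are hypothetical names for illustration.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Write a host-order 32-bit value into buf in network (big-endian) order. */
static void put_net32(uint32_t host_value, unsigned char buf[4])
{
    uint32_t wire = htonl(host_value);  /* swaps only on little-endian hosts */
    memcpy(buf, &wire, sizeof wire);
}

/* Read a network-order 32-bit value from buf back into host order. */
static uint32_t get_net32(const unsigned char buf[4])
{
    uint32_t wire;
    memcpy(&wire, buf, sizeof wire);
    return ntohl(wire);
}
```

For the address 192.168.0.1, written as 0xC0A80001, `put_net32` leaves the bytes C0 A8 00 01 in the buffer on both kinds of machine, and `get_net32` recovers the original value.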

Daniel Rudy
  • From my understanding, aren't & and | bitwise operators rather than logical ones? Are these terms interchangeable? – EnthusiatForProgramming May 23 '15 at 03:54
  • To me, logical and bitwise are interchangeable, but then again, I come from a hardware background and I understand what's going on in the CPU (electronically) when these operations are being performed. So yeah, it really *IS* an AND and OR as those types of logic gates are used to perform the operation. – Daniel Rudy May 23 '15 at 04:04
  • @EnthusiatForProgramming You are correct that for someone with a programming background, bitwise or and logical or are different. Daniel is correct that at a bit-by-bit level, it is of course just a logical operation. – Chris Hayes May 23 '15 at 04:05
  • @ChrisHayes What is the difference between the two, if you don't mind my asking? – Daniel Rudy May 23 '15 at 04:11
  • (This applies to all languages I'm aware of but I'm ready to accept there are some where it doesn't.) The logical operators, typically `&&` and `||`, take two booleans and output a boolean; the bitwise operators, typically `&`, `|` and `^`, manipulate sets of bits by performing operations on each pair of bits in turn. There's room for a great deal of variation here; for example, the logical operators in many languages support conversion of their operands from "truthy" or "falsy" values, and output things which are "truthy" or "falsy" (but not necessarily `true` or `false`). JS is one example. – Chris Hayes May 23 '15 at 04:15
  • Great explanation. In case somebody decides to mindlessly copy and paste the macros you wrote, consider adding all the parentheses you'd need to make them stupid-proof. E.g., `#define endianswap16(x) (((x) << 8) | ((x) >> 8))` – ravron May 23 '15 at 04:31

It's fairly straightforward bit masking and shifting.

The (num & 0x000000ff) zeros out all but the lowest byte of the word. The << 24u shifts it by 24 bits, or three 8-bit bytes, putting it at the other end of the word. The next three lines move the remaining three bytes in a similar manner. Then the b0 | b1 | b2 | b3 combines those bytes to make the final word.
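Here is the question's code wrapped in a function, with the intermediate values for num = 9 written out in the comments. The name `swap_bytes32` is just for illustration.

```c
#include <stdint.h>

static uint32_t swap_bytes32(uint32_t num)
{
    /* Values shown for num = 9, i.e. 0x00000009. */
    uint32_t b0 = (num & 0x000000ffu) << 24u;  /* 0x00000009 -> 0x09000000 */
    uint32_t b1 = (num & 0x0000ff00u) << 8u;   /* 0x00000000 -> 0x00000000 */
    uint32_t b2 = (num & 0x00ff0000u) >> 8u;   /* 0x00000000 -> 0x00000000 */
    uint32_t b3 = (num & 0xff000000u) >> 24u;  /* 0x00000000 -> 0x00000000 */

    /* Each bN occupies a different byte, so OR simply merges them. */
    return b0 | b1 | b2 | b3;                  /* 0x09000000 */
}
```

Because the operation only reverses byte order, applying it twice gives back the original value.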

See How do you set, clear, and toggle a single bit? and What are bitwise shift (bit-shift) operators and how do they work?. The same operations that work on single bits work on groups of bits, in this case the 8 bits that make a byte.

AShelly
  • Though I know what you mean, *word* might be a little loose in terminology, since we're talking about `uint32_t` which may not match the word size. – Chris Hayes May 23 '15 at 04:05