I am trying to understand bitwise operators better. I have a number of type uint32_t which I am trying to output byte by byte. The code that does that is:

void printByteWise(uint32_t num) {

  printf("byte 1 = %u\n", (num & 0xFF000000));
  printf("byte 2 = %u\n", (num & 0x00FF0000));
  printf("byte 3 = %u\n", (num & 0x0000FF00));
  printf("byte 4 = %u\n", (num & 0x000000FF));
}

Say num in the above code sample is 9. Then the byte array should be stored in memory like so:
09 00 00 00 (in increasing order of addresses). If so, the output should be:
byte 1 = 09
byte 2 = 00
byte 3 = 00
byte 4 = 00
but the output I get is:

byte 1 = 0
byte 2 = 0
byte 3 = 0
byte 4 = 9

I am on a little-endian system, which I checked like so:

int is_bigEndian() {
  int i = 1;
  char *low = (char *) (&i);
  return *low ? 0 : 1;
}  

Is this behaviour correct? Why am I seeing this behaviour?

Sriram
  • The result of `num & 0xFF000000` is not a byte. It's still a 32-bit integer with the 24 lower-order bits set to 0. – Fred Foo Feb 12 '13 at 10:44
  • %u tells printf to expect an unsigned int, not a uint32_t. For this reason, your code invokes undefined behaviour. Consider using the PRIu32 macro as a format string, for uint32_t values. – autistic Feb 12 '13 at 11:20
  • In addition, you can't use is_bigEndian to test for little endianness, because of the possibility of mixed endian systems. – autistic Feb 12 '13 at 11:22
  • @modifiablelvalue: and how do I check for mixed-endianness? – Sriram Feb 12 '13 at 12:15
  • @Sriram It's not important. A much better idea is to write code that doesn't care about the underlying representation of integers. For example, use sscanf and sprintf to decode and encode integer values as strings of decimal digits when receiving and sending them over the internet. – autistic Feb 13 '13 at 05:59

2 Answers

3

Remember, both of your operands are the same endianness.

On little endian, yes, 9 will be stored in memory as the bytes 09 00 00 00. You are then masking with 0xFF000000, which will be stored in memory as the bytes 00 00 00 FF and will therefore be used in that pattern as the mask.

If you want to see the effect fully, do as Ali Veli says and iterate over memory byte-by-byte by using a char pointer.

slugonamission
  • but is the `0xFF000000` not a constant? is that also "stored" in memory? This looks like a noob-ish question but I am trying to learn. – Sriram Feb 12 '13 at 10:46
  • @Sriram - Yes, it's a constant, but it is also located within your program. Performing the AND operation would be something similar to `AND %eax,#FF000000`, where the constant is in your program and still issued to your processor, then "interpreted" using the same endian-ness. Remember, programs aren't magic, the instructions still need to be encoded and stored ;). – slugonamission Feb 12 '13 at 10:47
  • how exactly is eax AND'ed with FF000000 here? where are operands stored for AND to work on them during execution time? – Hayri Uğur Koltuk Feb 12 '13 at 10:51
  • Assume `eax` already contains 9 for now. The operands are stored in memory, then cache, then issued to the processor in the instruction. In this case, your immediate value (0xFF000000) is encoded within the instruction, so on issue, the entire AND instruction (including the immediate) is issued and stored in an instruction buffer. This is then decoded by the processor, which sets up the ALU. Finally, it is then executed. Even if you loaded 0xFF000000 into a register and did `AND %eax %ebx`, you would have the same effect. – slugonamission Feb 12 '13 at 10:58
  • but this answer here covering bit shift operators makes it more confusing: http://stackoverflow.com/a/7184905/350685 – Sriram Feb 12 '13 at 13:18
  • Right. Let's assume a RISC architecture that doesn't support operating directly on memory. Assume memory is little endian and the registers are big endian. You load the values into the processor by using instructions expressing the data in little endian format. It then crosses the boundary into big endian. You then perform the AND operation, where they are still in big endian. You then store into RAM, which crosses the boundary back into little endian. – slugonamission Feb 12 '13 at 13:28
  • In short, you don't need to care what endian-ness the processor is. As long as you conform to its interface, it's fine. – slugonamission Feb 12 '13 at 13:29
0

With bitwise operators, your operand is actually the number in a register, not in memory. So endianness has nothing to do with it; this behaviour is expected and correct.

If you inspect the variable by casting its address to char * or something similar and go over the bytes by incrementing the pointer, then you will be reading a byte from memory each time (let's say the cache is transparent), and there you would see the effect of endianness.

Hayri Uğur Koltuk
  • I'd be grateful if the downvoter says why – Hayri Uğur Koltuk Feb 12 '13 at 10:43
  • Undid that, since much of your answer is correct, but this has nothing to do with memory vs. registers. – Fred Foo Feb 12 '13 at 10:44
  • well, afaik operands of bitwise operators are in registers, for example, otherwise I can't explain why shifting 1 position to the left always multiplies the number by 1, regardless of endianness. if you have more insight, please enlighten me :) – Hayri Uğur Koltuk Feb 12 '13 at 10:48
  • @AliVeli But the registers can still use the same endian-ness as the processor. A shift left will always multiply by **2** because the ALU also uses the same endian-ness and is wired up in such a way that it can be understood properly. Operations don't magically happen, they still have to go through an ALU and thus through wires to the correct location. – slugonamission Feb 12 '13 at 10:50
  • I tend to disagree because http://stackoverflow.com/a/4505750/767543 and for example http://www.cs.umd.edu/class/sum2003/cmsc311/Notes/Data/endian.html (The register is neither big endian nor little endian. It's just a register holding a 32 bit value) – Hayri Uğur Koltuk Feb 12 '13 at 10:57
  • @AliVeli - It doesn't actually matter what endian-ness your registers are. As you say, they just hold a 32-bit value. Your processor's interface is still little endian though, so given that you have to load in 0x9 and 0xFF000000, both operands will have to be the correct endian-ness before being loaded into the black hole that is your processor. You don't need to care about this internally, just on the interface. Your registers are then wired to the ALU in the "correct" format for what the ALU expects. – slugonamission Feb 12 '13 at 11:03