I'm trying to split a 2-byte number into two 1-byte numbers, but I get the wrong result. The assumed number is 0x1234.

uint8_t high = 0;
uint8_t low = 0;

high = static_cast<uint8_t >(val & 0xFF);
low = static_cast<uint8_t >(val >> 8);

cout << std::bitset<8>(high) << endl;
cout << std::bitset<8>(low) << endl;

cout << "high byte: " << static_cast<int >(high) << endl;
cout << "low byte: " << static_cast<int >(low) << endl;

When I run the code I expect to get the following output:

0x1234
00001100
00010010
high byte: 12
low byte: 34

Instead, I get:

0x1234
00110100
00010010
high byte: 34
low byte: 12

Why does my attempt fail?

Jesper Juhl
  • `val & 0xFF` gives you the low byte, not the high one. – Amadeus Feb 25 '19 at 19:51
  • `val >> 8` gives you the high byte (iff `val` is a 16-bit integer; otherwise you may need to mask off the higher bits). – Jesper Juhl Feb 25 '19 at 19:54
  • In English, when a number is cout'd (displayed or print'd) to a user, the least significant byte is always at the right end of the text-representing-the-number (regardless of the host endian-ness). Thus, 0x1234 is the text of the number, and 34 are the least significant 8 bits. – 2785528 Feb 25 '19 at 20:03
  • @2785528 that's why I expect to get 34 in low variable. – Soner from The Ottoman Empire Feb 25 '19 at 20:09
  • Please show a complete example. Confusing hex and decimal in literals and output is rather common, so even if you got it right, adding the missing pieces makes the question clearer. – 463035818_is_not_an_ai Feb 25 '19 at 20:09
  • @2785528 or you could just say that when printing or operating on a number using bitwise operators, you are always working with a Big Endian representation of the number. – Jesper Juhl Feb 25 '19 at 20:11
  • ...e.g. printing `0x34` will not result in `34` on your screen (unless you use the right I/O manipulator) – 463035818_is_not_an_ai Feb 25 '19 at 20:15

1 Answer

That's because you've named the variables wrong on these lines...

high = static_cast<uint8_t >(val & 0xFF);
low = static_cast<uint8_t >(val >> 8);

The `>>` operator shifts bits downward, from high bit positions to low bit positions. If you have to shift those bits down to preserve them through the cast, that's because they weren't originally the low bits. So...

low = static_cast<uint8_t >(val & 0xFF);
high = static_cast<uint8_t >(val >> 8);

BTW, the bitwise AND (`& 0xFF`) is redundant when you're casting to `uint8_t` anyway: the cast alone already discards everything but the low byte. It's still correct, just not necessary.

  • "the bitwise and operator is redundant when you're casting to uint8_t anyway": is that always true, irrespective of the machine's endianness? – Soner from The Ottoman Empire Feb 25 '19 at 20:18
  • @snr - yes, for `static_cast` endian issues don't exist. The way you'd be affected by endianness is if you did a `reinterpret_cast` from a large integer to an array of smaller integers, because endianness affects which bit of the larger integer relates to which smaller one in the memory layout, but `static_cast` deals with the values, not the layout. Results are still platform-defined, but only to the extent that the number of bits in particular integer types can vary a bit - a much stricter restriction than applies for `reinterpret_cast` (or for similar tricks involving `union`). –  Feb 25 '19 at 20:26
  • @snr - for bitwise operations and `static_cast`, the low bits are always the least significant bits, and the way they're arranged if you store them in memory is a different issue. There's advice for dealing with binary file formats that's based on this - you worry about the endian in the file format and extract-and-write/read-and-reassemble based on that without ever caring about the endian convention used by the platform your code is running on. The code is then more portable than tricks using `union` or `reinterpret_cast`, where you have *two* endian conventions to worry about. –  Feb 25 '19 at 20:37