
I want to combine the bytes of my uint8_t array into a single uint64_t. I have actually solved the problem, but I need to understand why the fix works. Here is my code:

    uint8_t byte_array[5];

    byte_array[0] = 0x41;
    byte_array[1] = 0x42;
    byte_array[2] = 0x43;
    byte_array[3] = 0x44;
    byte_array[4] = 0x45;

    cout << "index 0: " << byte_array[0] << " index 1: " << byte_array[1] << " index 2: " << byte_array[2] << " index 3: " << byte_array[3] << " index 4: " << byte_array[4] << endl;

    /* This does not work */
    uint64_t reverse_of_value = (byte_array[0] & 0xff) | ((byte_array[1] & 0xff) << 8) | ((byte_array[2] & 0xff) << 16) | ((byte_array[3] & 0xff) << 24) | ((byte_array[4] & 0xff) << 32);

    cout << reverse_of_value << endl;

    /* this works fine */
    reverse_of_value = (uint64_t)(byte_array[0] & 0xff) | ((uint64_t)(byte_array[1] & 0xff) << 8) | ((uint64_t)(byte_array[2] & 0xff) << 16) | ((uint64_t)(byte_array[3] & 0xff) << 24) | ((uint64_t)(byte_array[4] & 0xff) << 32);

    cout << reverse_of_value << endl;

The first output is "44434245" and the second is "4544434241", which is what I want.

So, as we can see, the code works when I cast each byte to uint64_t, but without the cast it gives an unexpected result. Can anybody explain the reason?

Lightness Races in Orbit
ergin
  • 1
    It may be that when you left-shift before upcasting to 64 bits wide datatype, the bits just get lost or garbled. – Erik Alapää Aug 26 '15 at 14:08
  • 1
    Adding an answer with example program showing left-shifting without upcast turning the unsigned char to 0. – Erik Alapää Aug 26 '15 at 14:20
  • @ErikAlapää: No need; the question contains one. Also, that's not an "upcast"; if you want a specific term, it's a "widening conversion". – Lightness Races in Orbit Aug 26 '15 at 14:20
  • 2
    OK, correct, the term upcast is for casting derived-class references or pointers to base class. – Erik Alapää Aug 26 '15 at 14:24
  • 1
    @ErikAlapää: why doesn't my left shift have any effect on 0x45? I expected '44434241' instead of '44434245'. – ergin Aug 26 '15 at 14:56
  • 2
    To be completely general, one must know the compiler and CPU architecture you are on. Endianness may affect the answer, and also how the individual CPU handles large shifts - things may differ between a 32-bit and a 64-bit CPU. Did you print in decimal or in hex? – Erik Alapää Aug 26 '15 at 15:03
  • 1
    @ergin: Actually, my gcc warns about the 32-bit shift being too big for the type. And probably, the 0x45 shifted 32 bits wraps around to the least significant byte of the 64 bits, gets OR-ed with 0x41, and you get 0x45 in the least significant byte. – Erik Alapää Aug 26 '15 at 15:16
  • @ErikAlapää: How does endianness affect the answer? – Lightness Races in Orbit Aug 26 '15 at 15:46
  • 2
    @LightnessRacesinOrbit: I meant in general. Bitwise shifts by themselves are executed in CPU registers and are endianness-independent. But when you combine with casts and memory references, endianness can matter. See e.g. the discussion in http://stackoverflow.com/questions/1041554/bitwise-operators-and-endianness – Erik Alapää Aug 26 '15 at 19:38
  • @ErikAlapää: No, the semantics of bitshifting in the C++ language is defined by the C++ language and is endian-agnostic. Of course if you hack away all the protections and start bypassing the type system, you can break that. But then you're not really using C++'s bitshifts on well-typed values, are you? You're doing something else. – Lightness Races in Orbit Aug 26 '15 at 19:43
  • 1
    @LightnessRacesinOrbit: I just said that bitshifting in itself is endian-agnostic. But combining with casts and memory references is not exotic, so it is important to be aware of what can happen. – Erik Alapää Aug 26 '15 at 19:52
  • @ErikAlapää: That's certainly true. :) – Lightness Races in Orbit Aug 26 '15 at 20:09
  • @ErikAlapää: I use the CodeBlocks IDE with the MinGW32-g++ compiler; my CPU is 64-bit and little-endian. Btw, I printed the value in decimal. – ergin Aug 27 '15 at 14:25
  • As an aside, I'm pretty sure the `& 0xff` is completely redundant, right? Because the value of the `uint8_t` is promoted to `int` before the `&`, as is the `0xff` literal, and those values' integer-promoted bitwise representations are what's and-ed - but there's no way `uint8_t` (integer-promoted or not) to have any other bits set than the ones masked by the value `0xff`. – mtraceur Apr 16 '16 at 19:05

3 Answers

7

Left-shifting a uint8_t that many bits isn't necessarily going to work. The left-hand operand will be promoted to int, whose width you don't know. It could already be 64-bit, but it could be 32-bit or even 16-bit, in which case… where would the result go? There isn't enough room for it! It doesn't matter that your code later puts the result into a uint64_t: the expression is evaluated in isolation.

You've correctly fixed that in your second version, by converting to uint64_t before the left-shift takes place. In this situation, the expression will assuredly have the desired behaviour.
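The promotion described above can be checked at compile time. This is only a minimal sketch: `widen_then_shift` is a hypothetical helper name, not something from the question, and the first assertion assumes an ordinary platform where `int` can represent every `uint8_t` value.

```cpp
#include <cstdint>
#include <type_traits>

// A uint8_t operand is promoted before the shift, so the shift expression
// has type int, no matter what the result is later assigned to.
static_assert(std::is_same<decltype(std::uint8_t{} << 8), int>::value,
              "uint8_t is promoted to int before shifting");

// Once the operand is a uint64_t, the shift is carried out in 64 bits.
static_assert(std::is_same<decltype(std::uint64_t{} << 32), std::uint64_t>::value,
              "a uint64_t operand keeps its type");

// Hypothetical helper: convert first, then shift. count must be below 64.
constexpr std::uint64_t widen_then_shift(std::uint8_t byte, unsigned count) {
    return static_cast<std::uint64_t>(byte) << count;
}
```

With this helper, `widen_then_shift(0x45, 32)` yields `0x4500000000`, which is the contribution the fifth byte is supposed to make.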

Lightness Races in Orbit
  • 2
    The more than 7 bits will go into the integer promoted result variable. A complete answer needs to address integer promotion. Suppose `int` is 64 bits? Then the code will work just fine. – Lundin Aug 26 '15 at 14:25
  • 1
    Not true. The left operand of a shift operator is promoted. The promoted type is at least size of `int`, which is at least 16 bit. So yes, left-shifting a `uint8_t` more than 7 bits (so long as it's less than `INT_BITS - 1` bits) does make sense. – ach Aug 26 '15 at 14:25
  • 2
    But it does not work. Because the actual implementation of this varies from compiler to compiler, and sometimes between versions. Trust me, I am in the middle of porting legacy 16-bit industrial control software, and these types of maths problems are overly annoying to track down and diagnose. – std''OrgnlDave Aug 26 '15 at 14:27
  • 1
    @LightnessRacesinOrbit, let's say the left-hand operand is promoted to a 32-bit int and I then shift it left by 32 bits. In my case that is 0x45 << 32, so I would expect the result to be 0x00 without the cast. Am I right? – ergin Aug 26 '15 at 14:51
  • 1
    @ergin: No, the behaviour is undefined. You could open up a black hole. – Lightness Races in Orbit Aug 26 '15 at 15:01
3

Here is an example showing left-shift turning the char to 0. At least it does so on my machine, gcc 4.8.4, Ubuntu 14.04 LTS, x86_64.

    #include <iostream>

    using std::cout;

    int main()
    {
        unsigned char ch;

        ch = 0xFF;

        cout << "Char before shift: " << static_cast<int>(ch) << '\n';
        ch <<= 10;

        cout << "Char after shift: " << static_cast<int>(ch) << '\n';
    }

Note also my comment on the original question above: on some platforms, the 0x45 shifted by 32 bits actually ends up in the least significant byte of the result.
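One plausible way to arrive at the asker's "44434245" is a shifter that only looks at the low 5 bits of the count for 32-bit operands, as x86 hardware does. The sketch below emulates that with well-defined operations; `shl32_masked` and `emulate_unwidened` are hypothetical names, and because the original expression has undefined behaviour, a compiler is not obliged to produce this result.

```cpp
#include <cstdint>

// Hypothetical helper: mimics a 32-bit shifter that ignores all but the
// low 5 bits of the shift count, so a count of 32 behaves like 0.
constexpr std::uint32_t shl32_masked(std::uint32_t value, unsigned count) {
    return value << (count & 31u);
}

// The question's first expression, with only the out-of-range shift
// replaced by the emulated one.
constexpr std::uint32_t emulate_unwidened() {
    return 0x41u
         | (0x42u << 8)
         | (0x43u << 16)
         | (0x44u << 24)
         | shl32_masked(0x45u, 32);  // 0x45 stays in the lowest byte
}
```

Here the 0x45 is OR-ed onto 0x41 in the lowest byte, and since 0x41 | 0x45 is 0x45, the emulated result is 0x44434245.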

Erik Alapää
3

Shifting a value by a count greater than or equal to the number of bits in its (promoted) type is undefined behavior in C++. See this answer for more detail: https://stackoverflow.com/a/7401981/1689844
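If the shift count is not known to be in range, one option is to check it before shifting. The following is a minimal sketch; `shl_or_zero` is a hypothetical helper, and returning 0 for an out-of-range count is just one possible policy, not something the language prescribes.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: performs the shift only when the count is smaller
// than the width of uint64_t (64 bits); otherwise returns 0 instead of
// invoking undefined behaviour.
constexpr std::uint64_t shl_or_zero(std::uint64_t value, unsigned count) {
    return count < static_cast<unsigned>(std::numeric_limits<std::uint64_t>::digits)
               ? value << count
               : 0;
}
```

Because the parameter is already a `uint64_t`, a `uint8_t` argument is converted before the shift takes place, so `shl_or_zero(byte_array[4], 32)` is evaluated in 64 bits.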

Community
statueuphemism