2

I am new in C++ programming. I am trying to implement a code through which I can make a single integer value from 6 or more individual bytes.

I have Implemented same for 4 bytes and it's working

My Code for 4 bytes:

char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x  %x  %x  %x  %x",command[2], command[3], command[4], command[5], value);

Using this Code the value of value is 82a12122 but when I try to do for 6 byte then the result was is wrong.

Code for 6 Bytes:

char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[0] << 40) + ((unsigned char)command[1] << 32) + ((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x  %x  %x  %x  %x  %x  %x", command[0], command[1], command[2], command[3], command[4], command[5], value);

The output value of value is 82a163c2 which is wrong, I need 42a082a12122. So can anyone tell me how to get the expected output and what is wrong with the 6 Byte Code.

Thanks in Advance.

  • 2
    `(unsigned char)command[0] << 40`, your unsigned char, will be promote to `int` which has less byte than `int64_t` in your case. – Jarod42 Aug 07 '18 at 15:33
  • 1
    don't use `%x`, for chars use `%hhx`, for int64 `%llx` in printf – Iłya Bursov Aug 07 '18 at 15:35
  • You have to cast it to `_int64` before shifting. – Jarod42 Aug 07 '18 at 15:42
  • 2
    @IlyaBursov: Technically, if you want to be careful, [you'd use the `inttypes.h` macros like `PRIx64`](https://stackoverflow.com/a/6850853/364696) rather than guessing at the correspondence between relative width prefixes like `ll` and the fixed bit width types. Or since it's C++, you'd just use `iostream` with `std::cout` and avoid the issue entirely. – ShadowRanger Aug 07 '18 at 15:48
  • 1
    Is this actually C++? looks like low-level C to me. – Walter Aug 07 '18 at 15:51
  • You should declare `command` as `uint8_t const * const` because `char` may be signed or unsigned (or char) as defined by the compiler. – Thomas Matthews Aug 07 '18 at 16:53
  • @Walter: The OP tagged it C++, so I'm assuming they're *able* to use C++ features, even if the code doesn't use any as written. – ShadowRanger Aug 09 '18 at 14:55

2 Answers2

5

Just cast each byte to a sufficiently large unsigned type before shifting. Even after integral promotions (to unsigned int), the type is not large enough to shift by more than 32 bytes (in the usual case, which seems to apply to you).

See here for demonstration: https://godbolt.org/g/x855XH

unsigned long long large_ok(char x)
{
    return ((unsigned long long)x) << 63;
}

unsigned long long large_incorrect(char x)
{
    return ((unsigned long long)x) << 64;
}


unsigned long long still_ok(char x)
{
    return ((unsigned char)x) << 31;
}

unsigned long long incorrect(char x)
{
    return ((unsigned char)x) << 32;
}

In simpler terms:

The shift operators promote their operands to int/unsigned int automatically. This is why your four byte version works: unsigned int is large enough for all your shifts. However, (in your implementation, as in most common ones) it can only hold 32 bits, and the compiler will not automatically choose a 64 bit type if you shift by more than 32 bits (that would be impossible for the compiler to know).

If you use large enough integral types for the shift operands, the shift will have the larger type as the result and the shifts will do what you expect.

If you turn on warnings, your compiler will probably also complain to you that you are shifting by more bits than the type has and thus always getting zero (see demonstration).

(The bit counts mentioned are of course implementation defined.)


A final note: Types beginning with double underscores (__) or underscore + capital letter are reserved for the implementation - using them is not technically "safe". Modern C++ provides you with types such as uint64_t that should have the stated number of bits - use those instead.

Max Langhof
  • 23,383
  • 5
  • 39
  • 72
  • Still getting the same answer, My expression is value = (((unsigned long long)command[0] << 40) + ((unsigned char)command[1] << 32) + ((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]); – Chandrapal Singh Jhala Aug 07 '18 at 15:47
  • You are missing it for the 32 bit shift. That one is too large too. But really, you should do it for all of them for consistency. Also mind my edit. – Max Langhof Aug 07 '18 at 15:49
  • Can I cast all of them with unsigned long long sir? – Chandrapal Singh Jhala Aug 07 '18 at 15:52
  • Yes, or just use `uint64_t`. It's shorter to type/read and guaranteed to be 64 bits long, unlike `unsigned long long`, which is not _guaranteed_ to be (but usually is) 64 bits. – Max Langhof Aug 07 '18 at 15:52
  • Sir, I have heard that endianess is necessary when doing byte shifting can you please explain a bit about that. – Chandrapal Singh Jhala Aug 07 '18 at 16:04
  • @ChandrapalSinghJhala Sorry, that would be off-topic for this question. If, after investigating that problem yourself, you cannot find a way that works, you are welcome to ask another question. – Max Langhof Aug 07 '18 at 16:13
2

Your shift overflows bytes, and you are not printing the integers correctly.

This code is working: (Take note of the print format and how the shifts are done in uint64_t)

#include <stdio.h>
#include <cstdint>

int main()
{
    const unsigned char *command = (const unsigned char *)"\x42\xa0\x82\xa1\x21\x22";
    uint64_t value=0;
    for (int i=0; i<6; i++)
    {
        value <<= 8;
        value += command[i];
    }
    printf("%x  %x  %x  %x  %x  %x  %llx",
             command[0], command[1], command[2], command[3], command[4], command[5], value);
}
Max Langhof
  • 23,383
  • 5
  • 39
  • 72
SHR
  • 7,940
  • 9
  • 38
  • 57