
I have an std::string that contains the response of a server. After parsing the string a bit, I come across a short. The short is big-endian and is stored in the string accordingly:

raw[0] == 0xa5;
raw[1] == 0x69;

I know this as

file << raw[0] << std::endl << raw[1];

when viewed as hex, results in "0xa5 0x0a 0x69".

I combine them into a short like this and then write it to a file, as inspired by https://stackoverflow.com/a/300837/1318909:

short x = (raw[1] << 8) | raw[0];
file << std::to_string(x);

Expected result: 27045 (0x69a5).

Actual result: -91 (0xffa5), which looks like an overflow.

Why is this? I tested and it worked fine with a value of 2402. I also did some additional testing and it works with

short x = (raw[1] << 8) | 0xa5;

but not with

short x = (0x69 << 8) | raw[0];
viderizer

2 Answers


char may (or may not) be signed, and it looks like it is on your platform. That means that 0xa5 is sign-extended to 0xffa5 before you OR in the upper byte.

Try casting each byte to unsigned char before doing your bit manipulation.
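As a minimal sketch of that suggestion: the helper name `read_u16_low_first` and its `pos` parameter are hypothetical and only for illustration. It keeps the byte order used in the question's expression (`raw[0]` as the low byte, `raw[1]` as the high byte) and casts each byte to `unsigned char` before shifting and OR-ing.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the original post): combines two bytes of
// `raw`, starting at `pos`, into a 16-bit value. raw[pos] is treated as the
// low byte and raw[pos + 1] as the high byte, as in the question's code.
std::uint16_t read_u16_low_first(const std::string& raw, std::size_t pos) {
    // Casting to unsigned char first means the later promotion to int
    // yields a value in 0..255, so no sign bits leak into the upper byte.
    const auto lo = static_cast<unsigned char>(raw[pos]);
    const auto hi = static_cast<unsigned char>(raw[pos + 1]);
    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

With the bytes from the question (0xa5 followed by 0x69), this should give 0x69a5, i.e. 27045. Returning an unsigned 16-bit type is a design choice of this sketch; assigning to a `short` as in the question works the same way for values that fit.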

Alan Stokes
  • This was the correct answer. Unsigned and signed values are stored the same way, so I expected it to work. Is there a place I can read more about sign extension? – viderizer Jul 07 '14 at 21:47
  • @Viderizer well, there are a lot of topics; it is not really sign extension per se. The topics include [char can either be signed or unsigned](http://stackoverflow.com/questions/4337217/difference-between-signed-unsigned-char), [integer promotion](http://stackoverflow.com/questions/24371868/why-must-a-short-be-converted-to-an-int-before-arithmetic-operations-in-c-and-c) and [usual arithmetic conversions](http://stackoverflow.com/questions/22047158/why-is-this-happening-with-the-sizeof-operator). There are many threads on each of these. – Shafik Yaghmour Jul 08 '14 at 00:53
  • It can take a while to fully get where each one applies. I am not even sure where the canonical threads for each of these are; I just linked to ones I am familiar with. Ultimately you need to be at least passingly familiar with the standards; none of the threads can really cover all the nuances. – Shafik Yaghmour Jul 08 '14 at 00:54

raw[0] is an object of type char, which happens to be an 8-bit signed integer type on your platform. When you store the value 0xA5 in such a signed char object, it actually acquires the value -91. When raw[0] is used as an operand of the | operator, it is subjected to the usual arithmetic conversions. These convert it to a value of type int with the value -91 (0xFFA5). That FF in the higher byte is what causes the observed result.
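To make the promotion visible, here is a small sketch. The function names `combine_signed` and `combine_unsigned` are made up for illustration. It uses `signed char` explicitly so the result does not depend on whether plain `char` is signed, and it assumes a two's complement representation (guaranteed since C++20, and near-universal before that).

```cpp
// Hypothetical illustration: the same expression with signed and unsigned
// byte types. Both operands are promoted to int before << and | are applied.

int combine_signed(signed char lo, signed char hi) {
    // A negative `lo` (e.g. -91, bit pattern 0xA5) is promoted to an int
    // whose upper bits are all ones, so they overwrite the shifted high byte.
    return (hi << 8) | lo;
}

int combine_unsigned(unsigned char lo, unsigned char hi) {
    // Unsigned bytes promote to ints in 0..255, so the upper bits of `lo`
    // stay zero and the high byte survives the OR.
    return (hi << 8) | lo;
}
```

Under those assumptions, `combine_signed(-91, 0x69)` gives -91, while `combine_unsigned(0xa5, 0x69)` gives 27045. This also explains why 2402 (0x0962) worked in the question: its low byte 0x62 is below 0x80, so it is not negative as a signed char and nothing spills into the upper byte.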

AnT stands with Russia