Let's say I have the following code lines:

int a; // 4-byte-integer
char b, c, d, e;

b = (char)(a >> 24);
c = (char)(a >> 16);
d = (char)(a >> 8);
e = (char)a;

Let's also assume that the system is storing the bytes in little-endian mode and a = 100.

When using an explicit cast like that, do the leftmost bytes disappear? I guess that after executing the above lines, the variables will hold these values: b=100, c=0, d=0, e=0. Is that right?

Polb
  • Unless you use `unsigned` variables this will cause `undefined behaviour`. Then the most significant part will be truncated. The endianness is not relevant in this example. – Weather Vane Dec 27 '15 at 20:30
  • Dup of http://stackoverflow.com/questions/6752567/casting-a-large-number-type-to-a-smaller-type? – Turn Dec 27 '15 at 20:32
  • @WeatherVane is it undefined behavior or unexpected behavior? The behavior of the sign bit is deterministic. – nicomp Dec 27 '15 at 20:38
  • @nicomp As long as the behaviour is unexpected, why shouldn't it be undefined? – Michi Dec 27 '15 at 20:47
  • Right-shifting a negative signed integer is _implementation-defined_, not _undefined_ behaviour. The standard says (§6.5.7): _The result of `E1 >> E2` is `E1` right-shifted `E2` bit positions. If `E1` has an unsigned type or if `E1` has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of `E1 / 2 ** E2`. If `E1` has a signed type and a negative value, the resulting value is implementation-defined._ (I used `2 ** E2` for 2 raised to the power of E2, since superscripts don't work in comments, even outside code quotes.) – Jonathan Leffler Dec 27 '15 at 20:48
  • @nicomp: right shifting a signed value is *implementation defined*, but storing a value outside its boundaries into a signed type is *undefined behavior*. Sad and quite unexpected, but true, with far-reaching consequences under modern optimizing compilers. – chqrlie Dec 27 '15 at 20:48

1 Answer

Your guess has the bytes in the wrong order: with a = 100 the result is b=0, c=0, d=0, e=100. Your explanation also needs a few corrections:

  • The behavior of the above code does not depend on the endianness of the system: if `int` is 32 bits and `char` 8 bits, `a >> 24` is the high-order byte and `a & 255` the low-order byte, regardless of endianness. Shifts operate on the value, not on its layout in memory.

  • Explicit casts such as `(char)` are not needed, because C implicitly converts the expression value to the type of the assignment destination. I suppose the programmer wrote it this way to silence a compiler warning. Microsoft compilers are notoriously vocal about losing precision in assignments.

  • The leftmost bytes do not disappear from `a`; the conversion produces a new, narrower value. For an unsigned destination the value is reduced modulo `UCHAR_MAX + 1` (256 when `char` has 8 bits, as it hopefully does in your case), so `(unsigned char)a` is the same as `a & 255`. But if `char` is signed and the value does not fit in the range `CHAR_MIN` to `CHAR_MAX`, the result is implementation-defined rather than fixed by the Standard. It is wise to use unsigned types for this kind of bit manipulation.
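
As a minimal sketch of the unsigned approach suggested in the last point, the helper below extracts one byte from a 32-bit value using only shifts and masks. The name `byte_at` and its indexing convention are hypothetical and not part of the original question; the sketch assumes 8-bit bytes and that `uint32_t` is available.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return byte `index` of a 32-bit value, where
   index 0 is the most significant byte and index 3 the least significant.
   Shifts and masks act on the value, not on its memory layout, so the
   result does not depend on the endianness of the machine. */
unsigned char byte_at(uint32_t value, unsigned index)
{
    assert(index < 4);
    unsigned shift = 8u * (3u - index);      /* 24, 16, 8 or 0 */
    /* Masking with 0xFF keeps the result within unsigned char range,
       so the conversion is always well defined. */
    return (unsigned char)((value >> shift) & 0xFFu);
}
```

With `a = 100`, `byte_at(100, 0)`, `byte_at(100, 1)` and `byte_at(100, 2)` are all 0 and `byte_at(100, 3)` is 100, which corresponds to b=0, c=0, d=0, e=100 in the question.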

chqrlie