This works based on something called Two's Complement. The idea here, is that given some binary number, it's two complement will be it's One Complement (flip all bits) plus one. We can see a simple example, let's find the two's complement of 13
, which we can write as 0b01101
. 01101 (flip) -> 10010 (+1) --> 10011
Now, although if we interpreted that as a binary number as usual we would read 19
in decimal, we must know that the number is written in Two's Complement in order to reverse the procedure and arrive at the previous number 13
. So, from this we can see that we have represented things such that +13 = 01101
and -13 = 10011
, notice that the positive number starts with a 0
and it's symmetric with a 1
. This will be a constant when using this representation, positive numbers will always begin with a 0
, and negative ones with a 1
. Something else that is worth noting is that I prefixed a 0
to my original representation of 13
, that will be needed in order to correctly represent it's two's complement. You can try going through the same example without doing that and verifying it's necessity.
Now, let's take a look at a few values represented like this,
╔══════╦════════════════╦════════════════════════╗
║ Bits ║ Unsigned Value ║ Two's Complement Value ║
╠══════╬════════════════╬════════════════════════╣
║ 011 ║ 3 ║ 3 ║
╠══════╬════════════════╬════════════════════════╣
║ 010 ║ 2 ║ 2 ║
╠══════╬════════════════╬════════════════════════╣
║ 001 ║ 1 ║ 1 ║
╠══════╬════════════════╬════════════════════════╣
║ 000 ║ 0 ║ 0 ║
╠══════╬════════════════╬════════════════════════╣
║ 111 ║ 7 ║ -1 ║
╠══════╬════════════════╬════════════════════════╣
║ 110 ║ 6 ║ -2 ║
╠══════╬════════════════╬════════════════════════╣
║ 101 ║ 5 ║ -3 ║
╠══════╬════════════════╬════════════════════════╣
║ 100 ║ 4 ║ -4 ║
╚══════╩════════════════╩════════════════════════╝
As you can see, it works just as we had previously intended, however you can now begin to understand how the "bug" you found happens. The upper limit for a 4-bit representation in Two's Complement is the decimal value 3
. Let's see how we would reach -4
by simply adding 1
to it. 3 = 0b011
therefore 3+1 = 0b100
, which as you can see from the table maps to -4
(as opposed to 4
) on Two's Complement. Your case was this exact problem, but with more bits. Signed representation like this is circular, so overflowing on the top yields the bottom value. Let's look at your case
127 = 0b01111111
127 + 1 = 0b10000000
As you can see it starts with a 1
, therefore it is negative (!) and if you solve the Two's Complement you will see it represents -128 (as the lower bound is always larger than the upper bound).
It's given that not every hardware will implement things in the same way, Intel, AMD, ARM and, as far as I know, all major architectures for general purpose CPUs use Two's complement in their ALUs, but there is hardware that uses other techniques for implementing signing of integers, so fundamentally the behavior you described is undefined. Another interesting thing to notice is that IEEE's standard for floating point arithmetic, implements an exponent bias based signed float.
Finally, since we are talking about C here, do note that undefined behavior can be optimized by the compiler, one great example of such optimizations is described in this blog post.