Why does the first one take fewer bytes than second?
Peter Cordes' answer already covers the technical details, so I'd like to focus on the mathematical background:
An x86 CPU does not distinguish between large numbers (like 12345789) and the value zero: storing either one as a 32-bit constant requires 4 bytes.
However, the value zero is a very special value:
It can be written as (a-a) or as (a XOR a), where "a" can be any integer value!
This means that you can perform a trick:
You perform the operation subq %rcx, %rcx to calculate the value (rcx - rcx). It does not matter which value rcx has: if you subtract that value from itself, the result will be zero (because (a-a)=0).
This means that rcx will be 0 after that operation.
The operation xorq %rcx, %rcx has the same effect, because (a XOR a) is also always 0.
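
As a quick sanity check of the two identities, here is a minimal C sketch. The function names zero_by_sub and zero_by_xor are made up for illustration and are not taken from the question; the code only shows the arithmetic, not the instruction encoding, and a compiler is free to fold both functions into a constant zero.

```c
#include <stdint.h>

/* Mirrors "subq %rcx, %rcx": subtract a value from itself.
 * The result is 0 regardless of what a held before. */
uint64_t zero_by_sub(uint64_t a)
{
    return a - a;
}

/* Mirrors "xorq %rcx, %rcx": XOR a value with itself.
 * Every bit is XORed with an identical bit, so every result bit is 0. */
uint64_t zero_by_xor(uint64_t a)
{
    return a ^ a;
}
```

Calling either function with 0, with 12345789, or with UINT64_MAX returns 0 in every case, which is why the previous content of the register does not matter.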