The behaviors you observe are the result of `printf` interpreting the bits given to it as the type specified by the format specifier. In particular, at least for your system:

- The bits for an `int` argument and an `unsigned` argument in the same position within the argument list would be passed in the same place, so when you give `printf` one and tell it to format the other, it uses the bits you give it as if they were the bits of the other.
- The bits for an `int` argument and a `double` argument would be passed in different places—possibly a general register for the `int` argument and a special floating-point register for the `double` argument, so when you give `printf` one and tell it to format the other, it does not get the bits for the `double` to use for the `int`; it gets completely unrelated bits that were left lying around by previous operations.
Whenever a function is called, values for its arguments must be placed in certain places. These places vary according to the software and hardware used, and they vary by the type and number of arguments. However, for any particular argument type, argument position, and specific software and hardware used, there is a specific place (or combination of places) where the bits of that argument should be stored to be passed to the function. The rules for this are part of the Application Binary Interface (ABI) for the software and hardware being used.
First, let us neglect any compiler optimization or transformation and examine what happens when the compiler implements a function call in source code directly as a function call in assembly language. The compiler will take the arguments you provide for `printf` and write them to the places designated for those types of arguments. When `printf` executes, it examines the format string. When it sees a format specifier, it figures out what type of argument it should have, and it looks for the value of that argument in the place for that type of argument.
Now, there are two things that can happen. Say you passed an `unsigned` but used a format specifier for `int`, like `%d`. In every ABI I have seen, an `unsigned` and an `int` argument (in the same position within the list of arguments) are passed in the same place. So, when `printf` looks for the bits for the `int` it is expecting, it will get the bits for the `unsigned` you passed.
Then `printf` will interpret those bits as if they encoded the value for an `int`, and it will print the results. In other words, the bits of your `unsigned` value are reinterpreted as the bits of an `int`.¹
This explains why you see “-12” when you pass the `unsigned` value 4,294,967,284 to `printf` to be formatted with `%d`. When the bits 11111111111111111111111111110100 are interpreted as an `unsigned`, they represent the value 4,294,967,284. When they are interpreted as an `int`, they represent the value −12 on your system. (This encoding system is called two’s complement. Other encoding systems include one’s complement and sign-and-magnitude, in which these bits would represent −11 and −2,147,483,636, respectively. Those systems are rare for plain integer types these days.)
That is the first of two things that can happen, and it is common when the type you pass is wrong but is similar to the correct type in size and nature—the wrong type is passed in the same place the correct type would have been. The second thing that can happen is that the argument you pass is passed in a different place than the argument that is expected. For example, if you pass a `double` as an argument, it is, in many systems, placed in a separate set of registers for floating-point values. When `printf` goes looking for an `int` argument for `%d`, it will not find the bits of your `double` at all. Instead, what it finds in the place where it looks for an `int` argument might be whatever bits happened to be left in a register or memory location from previous operations, or it might be the bits of the next argument in the list of arguments. In either case, the value `printf` prints for the `%d` will have nothing to do with the `double` value you passed, because the bits of the `double` are not involved in any way—a completely different set of bits is used.
This is also part of the reason the C standard says it does not define the behavior when the wrong argument type is passed for a `printf` conversion. Once you have messed up the argument list by passing a `double` where an `int` should have been, all the following arguments may be in the wrong places too. They might be in different registers from where they are expected, or they might be in different stack locations from where they are expected. `printf` has no way to recover from this mistake.
As stated, all of the above neglects compiler optimization. The rules of C arose out of various needs, such as accommodating the problems above and making C portable to a variety of systems. However, once those rules are written, compilers can take advantage of them to allow optimization. The C standard permits a compiler to make any transformation of a program as long as the changed program has the same behavior as the original program under the rules of the C standard. This permission allows compilers to speed up programs tremendously in some circumstances. But a consequence is that, if your program has behavior not defined by the C standard (and not defined by any other rules the compiler follows), the compiler is allowed to transform your program into anything. Over the years, compilers have grown increasingly aggressive about their optimizations, and they continue to do so. This means that, aside from the simple behaviors described above, when you pass incorrect arguments to `printf`, the compiler is allowed to produce completely different results. Therefore, although you may commonly see the behaviors I describe above, you may not rely on them.
Footnote

¹ Note that this is not a conversion. A conversion is an operation whose input is one type and whose output is another type but has the same value (or as nearly the same as is possible, in some sense, as when we convert a `double` 3.5 to an `int` 3). In some cases, a conversion does not require any change to the bits—an `unsigned` 3 and an `int` 3 use the same bits to represent 3, so the conversion does not change the bits, and the result is the same as a reinterpretation. But they are conceptually different.