7

My questions stem from using printf to log things when building for platforms of multiple bit depths (32/64-bit, for example).

A problem that keeps rearing its ugly head is trying to print ints on multiple architectures. On 32-bit it would be something like

printf(" my int: %d\n", myInt);

but on 64-bit, it would have to be changed to

printf(" my int: %ld\n", (long)myInt);

I have two related questions:

  1. My first thought was that when you tell printf to print a variable, giving it a format, it would look at the address of that variable and grab as many bytes as it needed for that format. This seemed like a big problem at first. For example, if you had a variable myChar that was a char (1 byte), but used a format specifier of %d, that would tell printf to go to the address of myChar and grab the next 4 bytes to treat it like an int. If this were the case, it seems like printf would grab garbage data from neighboring variables (because it was grabbing 4 bytes, but the real value is only 1 byte). This appears not to be the case, however. By using myChar and specifying %d, printf grabs 1 byte and then pads the upper 3 bytes with 0's. Is my understanding correct here?

  2. If the above is true, is there any real harm in always promoting variables up to their largest type to avoid the kinds of problems seen in the 32/64-bit case? For example, if you have a short variable myShort and an int variable myInt, is there any downside in always printing them as:

    printf("myShort %ld", (long)myShort); printf("myInt %ld", (long)myInt);

Thanks for any clarification.

deleted_user
  • possible duplicate of [Automatic Type promotion in variadic function](http://stackoverflow.com/questions/7084857/automatic-type-promotion-in-variadic-function) –  Oct 21 '12 at 07:23
  • IMO, not really that close to the dup question. I would encourage people to keep the discussion going here instead of voting to close. – D.C. Oct 21 '12 at 07:36
  • Why do you believe that `printf("%d", myInt);` has to be changed for 64-bit platforms? – CB Bailey Oct 21 '12 at 08:08

4 Answers

6

Regarding printf: In the case you cited, "%d" must, by specification, handle the platform-defined 'int' data type. It doesn't matter whether it is 32 bits, 64 bits, or a 128-bit linear AS/400 value. If you want to promote the value to a larger type (and match that promotion with the related format specifier) you're certainly free to do so,

int a=0;
printf("%ld", (long)a);

is certainly defined behavior using promotion.

I think the real crux of your question comes in cases like the following, and whether forcing promotion can "solve" any problems that arise. For example:

char ch = 'a';
printf("%d", ch);

or what about:

char ch = 'a';
printf("%ld", (long)ch);

or maybe this (which is the real condition you seem to be trying to avoid):

char ch = 'a';
printf("%ld", ch);

The first of these will work, but only because the minimum width of anything passed through a variadic argument list is the platform int: the compiler auto-promotes the value to an int for you (the "default argument promotions"). Since "%d" expects a platform int, all will appear well.

The second will always work and is fully supported. There is a clear and defined promotion from char to long. Even if long is 64 bits (or larger) it will still work.

The third is UB all the way. printf is looking for a long but will be presented with only the bytes of an int. If this seems to "work" on your platform, check your platform's widths for int and long. It is likely "working" only because your platform's long and int are the same bit width. That makes for fun surprises when porting the code to platforms where they are not, and since the argument is pushed through va_arg, you won't see it until genuinely different widths come into play.
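
If you want to see where you stand, here is a minimal sketch that prints the two widths (using the C99 %zu specifier for size_t):

#include <stdio.h>

int main(void)
{
    /* Compare the widths discussed above: int vs. long on this platform. */
    printf("int:  %zu bytes\n", sizeof(int));
    printf("long: %zu bytes\n", sizeof(long));
    return 0;
}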

All of that being said, now throw an actual address to something (anything, really), such as that required by scanf, and we're looking at something entirely different.

int val;
sscanf("1234", "%ld", &val); /* mismatched: %ld expects a long*, but &val is an int* */

This is a seg-fault waiting to happen. Just as above, you'll never know it if your platform's long and int are the same width. Take this code to a box where long and int are different sizes and prep yourself for a gdb session over the ensuing core file.
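
For contrast, a minimal sketch of the correct pairings, where each conversion specifier matches the pointed-to type exactly:

#include <stdio.h>

int main(void)
{
    int  val;
    long lval;

    /* Each specifier matches its destination: %d -> int*, %ld -> long*. */
    sscanf("42", "%d", &val);
    sscanf("42", "%ld", &lval);
    printf("%d %ld\n", val, lval);
    return 0;
}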

WhozCraig
1

You said:

A problem that keeps rearing its ugly head is trying to print ints on multiple architectures

Is it dangerous to try to get ahead of type issues by passing in values that are not of that type's size? Yes. That's why the compiler warns you. The notion of portability, which seems to be causing you problems, is not designed to make printf happy.

It's designed to make your program run, and not crash, on multiple architectures. If you have platform-specific code you should use #ifdef macros to work around it.
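
As a minimal sketch of that #ifdef approach (the wide_t name and the __LP64__/_WIN64 width test here are just one plausible way to do it, not something prescribed above):

#include <stdio.h>

/* Hypothetical per-platform choice of a printable wide type. */
#if defined(__LP64__) || defined(_WIN64)
  typedef long long wide_t;
  #define WIDE_FMT "%lld"
#else
  typedef long wide_t;
  #define WIDE_FMT "%ld"
#endif

int main(void)
{
    wide_t x = 123;
    printf("wide value: " WIDE_FMT "\n", x);
    return 0;
}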

Otherwise you are rolling the dice, trying to paper over memory-level type conversion.

printf is a convenience, not a type conversion methodology.

It seems you are focused on ints - which you will probably get away with. But in general I would not rely on a technique like this.

deleted_user
0

bools (_Bool), chars, and shorts are first converted into int (if this conversion preserves the value, else into unsigned int) when passed to variadic functions like printf(). Similarly, floats get converted into doubles.

So, if you pass something smaller than int, printf() will grab the whole (unsigned) int without any problems (though if the promoted value is actually an unsigned int and you're printing it with %d instead of %u, you get undefined behavior).

Other types, AFAIR, do not undergo such conversions.
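
A minimal sketch of those default argument promotions in action:

#include <stdio.h>

int main(void)
{
    char  c = 'a';
    short s = 42;
    float f = 1.5f;

    /* c and s arrive at printf as int, f as double, so %d and %f
       match after the default argument promotions. */
    printf("%d %d %f\n", c, s, f); /* 97 42 1.500000 on an ASCII platform */
    return 0;
}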

This line:

print (" my int: %ld\n", (long)myInt);

isn't buying you anything over this line:

printf(" my int: %d\n", myInt);

Both are valid and the results will be practically identical. The only difference is that the former might result in bigger code and longer execution time (when sizeof(long) > sizeof(int)).

Alexey Frunze
0
  1. The arguments are passed on the stack, which has a fixed width (32 or 64 bits) per entry. The compiler 'casts' integers, chars, and shorts to the native width of the architecture; in the case of a double (or long long) on a 32-bit architecture, it allocates two slots on the stack. The "padding" is done either with zeroes, or the sign bit of the variable is copied to the remaining bits (this is called sign extension).

  2. One downside of promoting to 64 bits is the lack of compatibility with embedded systems, which often do not provide 64-bit printing. It also means some performance penalty on a 32-bit system, as the top 32 bits are always passed and converted (there is a 64-bit-wide division by 10 involved) without any real use. The bigger problem, however, falls into the domain of software engineering: a "future-compatible" log may give the false hope that all computation and all input to the system also work in 64-bit mode on 32-bit systems.

(long) on 32-bit architectures doesn't mean 64 bits. That is notated with (long long).
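
If you truly need a 64-bit value on both 32- and 64-bit builds, one standard C99 route (not covered above) is a fixed-width type with its matching format macro from <inttypes.h>:

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    int64_t big = 1234567890123LL;

    /* PRId64 expands to the correct conversion specifier for int64_t
       on the current platform. */
    printf("big: %" PRId64 "\n", big);
    return 0;
}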

Aki Suihkonen