15

I need a portable way to print the value of a variable n of type size_t. Since I use ANSI C89 I cannot use the z length modifier. My current approach is to cast the value to long unsigned int:

printf("%lu\n", (long unsigned int) n);

Provided that size_t is defined as either unsigned int or long unsigned int I can't see how this would fail. Is the cast safe?

August Karlstrom
  • 1
    Safe as long as the size fits into an UL... – Déjà vu Nov 29 '17 at 15:23
  • http://port70.net/~nsz/c/c11/n1570.html#7.19p4 *The types used for size_t and ptrdiff_t should not have an integer conversion rank greater than that of signed long int unless the implementation supports objects large enough to make this necessary.* Yet it is just a recommendation. – Eugene Sh. Nov 29 '17 at 15:26
  • 5
    `size_t` is an unsigned type, so does not have any 'trap' values. So it is always safe to convert it to `long unsigned int` even when the original value cannot be represented in the new type. – Ian Abbott Nov 29 '17 at 15:27
  • 1
    @Bilkokuya Yes, I meant 'safe' as in no UB. – Ian Abbott Nov 29 '17 at 15:37
  • 3
    @IanAbbott Apologies for that, I've removed my incorrect comment. Only just learned that `size_t` cannot ever be larger than `long unsigned int` in C89 due to the following: https://stackoverflow.com/a/39441237/955340 It appears it is safe in both regards, again - apologies. –  Nov 29 '17 at 15:39

2 Answers

17

In C89, size_t is defined as an unsigned integer type. Unlike later standards, C89 defines the list of unsigned integer types as the following:

  • unsigned char
  • unsigned short
  • unsigned int
  • unsigned long

As such, size_t in C89 will never be larger than unsigned long, and therefore the cast is always safe: it causes no undefined behaviour, and unsigned long is always large enough to hold the value in its entirety.

Worth noting: the C89 standard states, "A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program". This means that no extension could change this behaviour while still conforming to the C89 standard, because the unsigned integer types are specifically listed and therefore cannot be altered.

In later standards this is not guaranteed. You will not get undefined behaviour, but you may lose data where unsigned long is smaller than size_t, meaning that you would display incorrect data to your user. In that situation I'd be hesitant to label the cast as "safe".
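For code that must stay valid C89 but may also be built by a C99-or-later compiler, one option is to select the conversion at compile time. The following is only a minimal sketch: the helper name format_size is hypothetical, and it assumes that any compiler defining __STDC_VERSION__ as 199901L or greater also ships a C library that understands the z length modifier.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats n as decimal text into buf and returns the
   character count reported by sprintf. The caller must supply a buffer
   large enough for every digit plus the terminating '\0'. */
int format_size(char *buf, size_t n)
{
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
    return sprintf(buf, "%zu", n);                  /* C99 and later */
#else
    return sprintf(buf, "%lu", (unsigned long) n);  /* C89 fallback */
#endif
}
```

On a C89 build this is exactly the cast from the question, so the same caveat about a size_t wider than unsigned long still applies there.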


As an important additional note: this answer refers to compilers that comply with the C89 standard. Your C89 compiler may be "less than compliant" in the respects above, in which case treat the behaviour as similar to that of C99 or newer: you will not see undefined behaviour, but you may suffer data loss if size_t is larger than unsigned long. To be clear, though, such a compiler would not be complying with the C89 standard.

Beyond this, my interpretation of the standard (1.7 Compliance) is that, because extensions must not alter the behaviour of a "strictly conforming program", an extension cannot make size_t larger than unsigned long and remain compliant. That does not change the fact that such extensions exist. For example, GCC provides an extension that adds unsigned long long. In my view this is non-compliant, but in reality you must be prepared to deal with such things: while the standard says what you are doing is completely safe, you must be prepared for potential data loss where non-compliant compilers or extensions are used.


Please see here for previous discussion on this topic: https://stackoverflow.com/a/39441237/955340

  • 2
    "As such, size_t in C89 will never be larger than unsigned long, and therefore the cast is always safe" isn't true. There's no such guarantee in the standard. Having `size_t` as a 64-bit type and `unsigned long` as a 32-bit type is valid in C89. Having said that, there's not a much better option if one is stuck with C89. – P.P Nov 29 '17 at 15:58
  • P.P. Do you have an example of such a system or compiler? – August Karlstrom Nov 29 '17 at 16:01
  • Linked page in the answer has MSVC as an example. – P.P Nov 29 '17 at 16:04
  • 2
    That's based on the assumption that `sizeof(long) <= sizeof(size_t)` is required in C89 (i.e., that `size_t` must be one of the existing "integral types"), which I am not sure about; I can't find anything to that effect. Perhaps MSVC isn't a valid C89 implementation in that respect ;-) – P.P Nov 29 '17 at 16:19
  • 1
    @P.P. I have re-read the relevant parts of the draft standard for C89, and while I may be wrong - my interpretation is definitely that the MSVC example shows non-conforming behaviour. I will amend my answer to include this though, to ensure it's clear that relying on the letter of the standard may not produce results in reality. The draft standard I am using is: http://port70.net/~nsz/c/c89/c89-draft.html The specific line I'm going by is (3.3.3.4) "its type (an unsigned integral type) is size_t", combined with the list of type specifiers in (3.5.2) –  Nov 29 '17 at 16:22
  • 2
    `size_t` is an integral type, and integral type is defined as "The type char, the signed and unsigned integer types, and the enumerated types are collectively called integral types." So if `size_t` is defined as `unsigned long long`, say an extension provided by the implementation in C89, then it's an "unsigned integer type" and can be considered an integral type. So it's not clear if `size_t` must be an *existing* integral type. – P.P Nov 29 '17 at 16:36
  • 1
    @P.P. My interpretation of the standard (1.7 Compliance) *"A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program"* is that an extension which adds `unsigned long long` and allows `size_t` to be defined as it, would break previously *"strictly conforming"* code and therefore would not be compliant. However, I think the important part which I definitely agree on - is the reality will be that because such extensions do exist, relying on the "word of the standard" is not truly "safe". –  Nov 29 '17 at 16:46
  • 2
    The reasoning in this answer is not correct. It's possible for `size_t` to be an extended integer type (in which case there might be no way to print it with `printf` in C89). It's also possible that you're writing C89 code simply for cross-platform compatibility, but using it with a C99 or C11 compiler where larger standard types may be available. In C99 or later you can safely use a cast to `(uintmax_t)` and `%ju` but C89 did not have that. – R.. GitHub STOP HELPING ICE Nov 29 '17 at 16:55
  • 1
    @R.. In both of these cases, does this not mean that either the extension is technically non-compliant, or that you are not truly compiling C89 code with a compliant C89 compiler? I've added a proviso at the bottom to try to make clear that this is the reality of the situation and not to rely on the main answer for practical situations. But I disagree that the answer is incorrect in regard to writing a C89 program that follows the letter of the standard. If I'm mistaken, I'm happy to adapt the answer further. –  Nov 29 '17 at 17:00
  • 2
    @Bilkokuya: As far as I know, C89 allows implementations where `size_t` is an extended integer type. A C89 that accepts `long long` as an extension and uses `unsigned long long` to define `size_t` should be an example of such. It's true that a C99 implementation is not, strictly speaking, a conforming C89 implementation (due to some namespace considerations and subtle behavioral differences) but from a practical standpoint it's reasonable to write C89 code that's also valid C99 code and compile on a C99 implementation. – R.. GitHub STOP HELPING ICE Nov 29 '17 at 17:12
  • 1
    @R.. I definitely agree on the practical standpoint, and that holding the standard as a golden-ruling is likely to produce unsafe or buggy code in this instance. I feel what's probably most useful to people is that this discussion is visible, along with the bold warning in the answer - so for now I'm going to avoid editing the answer further. Hopefully it is now clear to anybody that comes here in future that the reality of the situation (regardless of what the standard dictates) is that you may cause issues if you convert `size_t` to `unsigned long` in C89 - but no UB. –  Nov 29 '17 at 17:18
3
size_t n = foo();
printf("%lu\n", (long unsigned int) n);

Provided that size_t is defined as either unsigned int or long unsigned int ... Is the cast safe?

Yes, the cast is safe, with neither undefined behavior nor loss of information, in C89, C99, and C11.


But what happens if that proviso is not true?

Assuming the range of size_t will be within the range of unsigned long is very reasonable. Add a compile-time test:

#include <limits.h>
#if defined(__STDC__)
#if defined(__STDC_VERSION__)
#if (__STDC_VERSION__ >= 199901L)
#include <stdint.h>
#if SIZE_MAX > ULONG_MAX
#error Re-work printf size code
#endif
#endif
#endif
#endif

The point is that when code has a dependency, add a test. Even if the assumption holds on all known machines today and historically, the future has unknowns.

C, today, with its immense flexibility does allow SIZE_MAX > ULONG_MAX, but it is certainly rare. IMO, SIZE_MAX > ULONG_MAX is beyond the pale.
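The preprocessor test above only takes effect from C99 onward, because SIZE_MAX does not exist in C89 and casts are not allowed in #if expressions. A C89-compatible alternative, offered here only as a sketch, is the negative-array-size idiom. The typedef and function names below are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Compile-time check usable in C89: if size_t can hold a value above
   ULONG_MAX, the array size becomes -1 and the translation unit is
   rejected. The typedef name is hypothetical. */
typedef char assert_size_t_fits_in_unsigned_long[
    ((size_t) -1 <= ULONG_MAX) ? 1 : -1];

/* Run-time counterpart of the same comparison, for code that would rather
   report the problem than fail the build. Returns 1 when every size_t
   value fits in an unsigned long, 0 otherwise. */
int size_t_fits_in_unsigned_long(void)
{
    return (size_t) -1 <= ULONG_MAX;
}
```

(size_t) -1 is the largest value size_t can hold, and the comparison is carried out in the wider of the two unsigned types, so neither operand is truncated.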


Such tests are common because, from time to time, writing super-portable code, though possible, is simply not practicable or budgeted.

#include <limits.h>
#if CHAR_BIT != 8 && CHAR_BIT != 16 && CHAR_BIT != 32 && CHAR_BIT != 64
  #error Code depends on char size as a common power of 2.
#endif

I need a portable way to print the value of a variable n of type size_t.

Yet to address OP's top level goal, a simple portable helper function can be written.

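#include <stddef.h>
#include <stdio.h>

/* This approach works with any unsigned type */
void print_size_t(size_t n) {
  if (n >= 10) print_size_t(n/10);  /* print the higher-order digits first */
  putchar((int) (n%10) + '0');      /* then the lowest digit */
}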

To avoid recursion, a slightly longer function:

#include <limits.h>
#include <stddef.h>
#include <stdio.h>
void print_size_t(size_t n) {
  char buf[sizeof n * CHAR_BIT / 3 + 2];  /* 1/3 is more than log10(2) */
  char *p = &buf[sizeof buf - 1];          /* Start at end of buf[] */
  *p = '\0';
  do {
    p--;
    *p = (char) (n%10 + '0');
    n /= 10;
  } while (n);    /* Use a do {} while so print_size_t(0) prints something */
  fputs(p, stdout);
}
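If the digits are needed as a string rather than on stdout, for example to pass to fprintf with %s, the same loop can fill a caller-supplied buffer. This is a sketch only: the macro SIZE_T_STR_MAX and the function size_t_to_str are hypothetical names, not part of any standard library.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical macro: room for every decimal digit of a size_t plus '\0',
   using the same bits/3 estimate as the function above. */
#define SIZE_T_STR_MAX (sizeof(size_t) * CHAR_BIT / 3 + 2)

/* Writes the decimal digits of n right-aligned in buf, which must hold at
   least SIZE_T_STR_MAX chars, and returns a pointer to the first digit.
   The returned pointer points inside buf, not necessarily at buf[0]. */
char *size_t_to_str(size_t n, char *buf)
{
    char *p = buf + SIZE_T_STR_MAX - 1;  /* start at the end of the buffer */
    *p = '\0';
    do {
        p--;
        *p = (char) (n % 10 + '0');
        n /= 10;
    } while (n);                         /* do/while so 0 yields "0" */
    return p;
}
```

A caller would declare char buf[SIZE_T_STR_MAX]; and then write something like printf("%s\n", size_t_to_str(n, buf));, which stays within C89.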
chux - Reinstate Monica