4

While looking for evidence that unsigned long is large enough to hold any size_t value, for the purpose of passing it as an argument to printf, I ran into two fact(oid)s.

First, there's this answer stating that long is indeed not guaranteed to be large enough for size_t. On the other hand, I saw this answer suggesting the use of printf("%lu", (unsigned long)x) in pre-C99 code, where x is of type size_t.

So the question is could you assume or were you guaranteed that long were enough to hold size_t in pre C99. The other question is whether there exists any guarantee that size_t would fit in any of the other standardized integer types (except the obvious exceptions like ssize_t, ptrdiff_t and such).

Community
skyking

2 Answers

7

There is no such guarantee.

While it is common for implementations to use the same size for long and size_t, this is not always the case. As noted in the comments, 64-bit Windows has different sizes for long and size_t (32-bit long, 64-bit size_t).

Also note that the minimum permitted value of SIZE_MAX is 65535, while the minimum permitted value of ULONG_MAX is 4294967295 (2147483647 for LONG_MAX). (Note that SIZE_MAX first appeared in C99.) This means that size_t is only guaranteed to be at least 16 bits wide, whereas long and unsigned long are guaranteed to be at least 32 bits wide.
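A minimal sketch of how one might check the relationship on a given implementation. The function names are made up for illustration, and `(size_t)-1` stands in for the largest size_t value because SIZE_MAX is not available before C99.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: nonzero if every size_t value fits in unsigned long.
   (size_t)-1 wraps around to the largest size_t value. */
int size_t_fits_in_ulong(void)
{
    return (size_t)-1 <= ULONG_MAX;
}

/* Hypothetical helper: prints the storage sizes of both types in bits.
   The casts make the arguments match %lu regardless of what size_t is. */
void print_type_sizes(void)
{
    printf("size_t: %lu bits, unsigned long: %lu bits\n",
           (unsigned long)(sizeof(size_t) * CHAR_BIT),
           (unsigned long)(sizeof(unsigned long) * CHAR_BIT));
}
```

On 64-bit Windows, with its 32-bit long and 64-bit size_t, the first function would be expected to return 0.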

EDIT: Question has changed a little bit after this answer... So:

So the question is could you assume or were you guaranteed that long were enough to hold size_t in pre C99.

There is no such guarantee even in C89: long can be 32-bit while size_t is 64-bit. (See the example of MSVC on 64-bit Windows above.)

The other question is whether there exists any guarantee that size_t would fit in any of the other standardized integer types (except the obvious exceptions like ssize_t, ptrdiff_t and such).

Again there is no such guarantee by the Standard. size_t is an alias for another standard unsigned integer type (and it cannot be an extended integer type as C89 does not have extended integer types).

ouah
  • Also case in point, 64 bit windows have 32 bit longs and 64 bit size_t – nos Aug 19 '15 at 08:49
  • Also: `size_t` is unsigned, `long`s are signed. – Kninnug Aug 19 '15 at 08:50
  • In the case of Windows and Visual Studio, the format specifier for printing a size_t variable is (% upper case I) %I optionally followed by o, u, x, or X, which will work in both 32 bit and 64 bit modes. For GCC I think it's %zu (not sure if other suffixes can be used with %z). – rcgldr Aug 19 '15 at 09:06
  • Unfortunately this answer seems to have been posted prematurely (as the question was); the question was later completed, so this doesn't really answer the questions. – skyking Aug 19 '15 at 10:03
  • @skyking - In the case of Visual Studio (2013 and older, maybe 2015 as well), its C (not C++) compiler is pre C99. – rcgldr Aug 19 '15 at 10:39
  • Is there even a guarantee that `SIZE_MAX+1` won't yield Undefined Behavior [e.g. if `int` and `long` have 32 value bits and a sign bit, while `unsigned int`, `unsigned long`, and `size_t` have 32 value bits and a padding bit, I think `SIZE_MAX+1` would promote `SIZE_MAX` to `long` (with value LONG_MAX), so adding 1 would cause overflow.] – supercat Aug 19 '15 at 18:03
  • In C89/C90, since there were no extended integer types and `size_t` must be one of the standard unsigned types, it was (I think) guaranteed that any `size_t` value is within the range of `unsigned long`. That guarantee evaporated in C99 with the introduction of `unsigned long long`. – Keith Thompson Sep 11 '16 at 22:06
  • @KeithThompson: Nothing in C89 would have forbidden an implementation from defining a __uint64 or, for that matter, __uint24 type. If the underbars were included, such a thing wouldn't even be an "extension" as C89 defines the term (note that annex J2 of C89 lists predefinition of certain macros *without underbars* as a common extension, implying both that the existence of additional predefined identifiers without underbars would be a legitimate extension, and that the existence of identifiers without underbars need not be considered an extension at all). – supercat Sep 16 '16 at 17:15
  • @supercat: ISO C90 has no Annex J2. I don't have a copy of ANSI C89. ISO C90 lists "Common extensions" in G.5 -- and it clearly says, "The inclusion of any extension that may cause a strictly conforming program to become invalid renders an implementation nonconforming.". Such an "extension" is not a permitted extension as described in C90 section 4, "Compliance". C90 requires `size_t` to be an "unsigned integral type". C90 6.1.2.5 defines what *signed integer types* and *unsigned integer types* are, by listing them. – Keith Thompson Sep 16 '16 at 17:31
  • @supercat: In any case, C90 is an obsolete standard. We needn't worry about new implementations inventing new extensions. If I'm writing code that must work with a conforming C89/C90 implementation, it's perfectly safe to assume that any `size_t` value is within the range of `unsigned long`, both by the letter of the standard and the practical consideration of what any real-world compilers actually do. – Keith Thompson Sep 16 '16 at 17:32
3

So the question is could you assume or were you guaranteed that long were enough to hold size_t in pre C99.

Not long, but unsigned long.

In C89/C90, size_t is required to be an unsigned integral type. There are exactly 4 unsigned integer types in C89/C90: unsigned char, unsigned short, unsigned int, and unsigned long. Therefore in C89/C90, size_t can be no wider than unsigned long, and therefore any value of type size_t can be converted to unsigned long without loss of information. (This applies only to unsigned long, not to long.)

This implicit guarantee vanished in C99, with the introduction of unsigned long long and of extended integer types. In C99 and later, size_t can be wider than unsigned long. For example, a C99 implementation might have 32-bit long and 64-bit long long, and make size_t an alias for unsigned long long.
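With a C99 (or later) implementation, the situation described above can be detected at compile time. This is only a sketch: the macro and function names are invented for illustration, and it relies on SIZE_MAX from `<stdint.h>`, which does not exist in C89/C90.

```c
#include <limits.h>
#include <stdint.h>  /* SIZE_MAX; C99 and later only */

/* Hypothetical macro: 1 if size_t can hold values that unsigned long cannot. */
#if SIZE_MAX > ULONG_MAX
#define SIZE_T_WIDER_THAN_ULONG 1
#else
#define SIZE_T_WIDER_THAN_ULONG 0
#endif

/* Hypothetical wrapper so the result can also be inspected at run time. */
int size_t_wider_than_ulong(void)
{
    return SIZE_T_WIDER_THAN_ULONG;
}
```

If the macro is 1, a cast to unsigned long can lose information for large values, so the `%lu` idiom is only safe for values known to be small.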

Even in C89/C90, you can rely on the guarantee only if you have a conforming C89/C90 implementation. It was common for pre-C99 compilers to provide extensions on top of the C89/C90 standard -- for example a compiler might support long long, and might make size_t an alias for unsigned long long, even if the compiler didn't fully support the C99 (or C11) standard.

The question was about printf. Keep in mind that an argument to printf must be of an appropriate type for the format string. This:

printf("sizeof (int) = %lu\n", sizeof (int));

has undefined behavior unless size_t happens to be an alias for unsigned long (even if size_t and unsigned long happen to have the same size). You need to cast the value to the correct type:

printf("sizeof (int) = %lu\n", (unsigned long)sizeof (int));

For C99 and later, you can print size_t values directly:

printf("sizeof (int) = %zu\n", sizeof (int));

And if you like, you can test the value of __STDC_VERSION__ to determine which one to use.
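A minimal sketch of that test, assuming a helper named `print_size` (a name made up here, not a standard function). It uses `%zu` when the compiler advertises C99 or later and falls back to the cast otherwise.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints a size_t value and returns printf's result
   (the number of characters written, or a negative value on error). */
int print_size(size_t n)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    /* C99 or later: size_t can be printed directly. */
    return printf("%zu", n);
#else
    /* C89/C90: convert to unsigned long to match %lu. */
    return printf("%lu", (unsigned long)n);
#endif
}
```

A call such as `print_size(sizeof (int));` then works under either standard. One caveat: __STDC_VERSION__ describes the compiler, so this assumes the printf in the runtime library supports `%zu` whenever the compiler claims C99.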

(A note on editions of the C standard. The first C standard was published in 1989 by ANSI. It was republished, with extra boilerplate sections added, by ISO in 1990. So C89 and C90 are two different documents that describe the same language. The later C99 and C11 standards were published by ISO. All three ISO C standards were officially adopted by ANSI. So strictly speaking "ANSI C" should refer to ISO C11 -- but for historical reasons the phrase is still used to refer to the 1989 standard.)

Keith Thompson