69

According to §6.3.2.3 ¶3 of the C11 standard, a null pointer constant in C can be defined by an implementation as either the integer constant expression 0 or such an expression cast to void *. In C the null pointer constant is defined by the NULL macro.

My implementation (GCC 9.4.0) defines NULL in stddef.h in the following ways:

#define NULL ((void *)0)
#define NULL 0

Why are both of the above expressions considered semantically equivalent in the context of NULL? More specifically, why do there exist two ways of expressing the same concept rather than one?

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
Brad Jones
  • 815
  • 7
  • 10
  • 4
    In case of GCC, the shorter definition (`0`) is for C++. `<stddef.h>` may be included by both C and C++ source files. – pts Dec 21 '22 at 03:47
  • 8
    If `(void*)0` is used for *NULL*, that can catch bugs like `int x; ... if (x == NULL) {...}`. With `0`, the code would compile without warnings. – pts Dec 21 '22 at 03:52
  • 9
    IIRC the earliest C versions didn't know `void` at all. A `void*` wasn't an option then. – Gerhardh Dec 21 '22 at 07:47
  • 6
    `NULL` is a macro provided by the standard library that expands to an otherwise unspecified null pointer constant. Do not confuse `NULL`, a specific macro, with null pointer constants in general. – John Bollinger Dec 21 '22 at 17:55
  • 4
    @Gerhardh - Back In Pre-ANSI-Standard Days (tm) C was less strongly typed, and assignment of equal-byte-sized objects without a cast was considered normal. And since integers and pointers were commonly 32 bits in size, assigning integers to pointers and pointers to integers was a common thing. As someone once put it, "Strong typing is for weak minds". And we liked it that way! WE **LOVED** IT!!! :-) – Bob Jarvis - Слава Україні Dec 22 '22 at 16:07
  • Does this answer your question? [What is the difference between NULL, '\0' and 0?](https://stackoverflow.com/questions/1296843/what-is-the-difference-between-null-0-and-0) – Karl Knechtel Jan 17 '23 at 04:18
  • @KarlKnechtel, yes: that partially answers my question. – Brad Jones May 11 '23 at 06:36

9 Answers

54

Let's consider this example code:

#include <stddef.h>
int *f(void) { return NULL; }
int g(int x) { return x == NULL ? 3 : 4; }

We want f to compile without warnings, and we want g to cause an error or a warning (because an int variable x was compared to a pointer).

In C, #define NULL ((void*)0) gives us both (GCC warning for g, clean compile for f).

However, in C++, #define NULL ((void*)0) causes a compile error for f. Thus, to make it compile in C++, <stddef.h> has #define NULL 0 for C++ only (not for C). Unfortunately, this also prevents the warning from being reported for g. To fix that, C++11 uses built-in nullptr instead of NULL, and with that, C++ compilers report an error for g, and they compile f cleanly.

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
pts
  • 80,836
  • 20
  • 110
  • 183
  • 6
    I hope that `nullptr` will be added to the next version of C. – Harith Dec 21 '22 at 06:20
  • 2
    Would there be any problem in having `#define NULL nullptr` for C++? – nielsen Dec 21 '22 at 07:21
  • 7
    This doesn't explain why C allows `0` as a null pointer constant however. – Lundin Dec 21 '22 at 07:23
  • 10
    @Haris, `nullptr` and `nullptr_t` are added to the latest C23 draft. See https://www9.open-std.org/JTC1/SC22/WG14/www/docs/n3054.pdf – tstanisl Dec 21 '22 at 10:17
  • It's a pain working with code that has `char c = NULL;` where the intent is `char c = '\0';` — the plain 0 version of `NULL` works fine, but the cast version does not. For a long time, C used `#define NULL 0` which is why such code existed and worked. – Jonathan Leffler Dec 21 '22 at 13:59
  • And apparently there is so much code like that [that `#define NULL nullptr` cannot be mandated in C++](https://stackoverflow.com/questions/32302615/why-not-call-nullptr-null). (I fear that C will be painted into the same corner if ever it adopts `nullptr`.) – Steve Summit Dec 21 '22 at 14:19
  • 6
    @JonathanLeffler That particular issue always felt to me like a conflation between (1) `NULL` meaning the pointer value and (2) `NUL` meaning the name of character 0 from [the ***C0 control code*** names](https://en.wikipedia.org/wiki/C0_and_C1_control_codes#Basic_ASCII_control_codes), the same set that also brought us `ACK` and `NAK` and `BEL` and `BS` and `FF` and such. (Those are now further memorialized as graphic symbols in the Unicode `Control_Pictures` block starting at U+2400 `SYMBOL FOR NULL`, but that doesn't matter here.) – tchrist Dec 21 '22 at 16:34
  • 1
    @Lundin NULL is 0 as C was first written on a DEC PDP and 0 is not a valid memory address on the PDP, or something similar - I can't find the correct reference, but 0 was chosen as it behaved differently on a PDP. On some machines NULL was not necessarily 0. – mmmmmm Dec 21 '22 at 19:46
  • @mmmmmm: I'm yet to hear about any architecture where the binary representation of `NULL` is not all zeros. All the popular ones have zeros. – pts Dec 21 '22 at 20:04
  • @SteveSummit: With some mainstream implementations such as GCC already defining `NULL` in C as `((void*)0)`, confused code like `char c = NULL;` is already broken on that implementation. GCC should have no problem changing to `nullptr` for C. – Peter Cordes Dec 21 '22 at 20:05
  • @PeterCordes I hope you're right. (I'm not in favor of leaving `NULL` defined as plain `0`, mind you, and I'm appalled beyond words that C++ did *not* mandate `nullptr` as the new definition of `NULL`.) – Steve Summit Dec 21 '22 at 20:10
  • 4
    @pts Run, don't walk, to https://c-faq.com/null/machexamp.html . – Steve Summit Dec 21 '22 at 20:11
  • 3
    @mmmmmm It's true that 0 for NULL arose on the PDP-11, but not because it was an invalid memory address. 0 was a perfectly valid memory address in those days, and IIRC it wasn't until BSD Unix implemented virtual memory in the 1980's that it became an option (and eventually standard practice) to arrange that the page containing address 0 wasn't mapped in. – Steve Summit Dec 21 '22 at 20:28
  • @SteveSummit Ok that makes sense - I learnt about 0 as coming from C or Unix design assumptions, so it sounds like it was BSD. I knew that NULL was not 0 on some machines, so the articles I saw claimed it was a misfeature making 0 act as NULL. – mmmmmm Dec 21 '22 at 21:02
  • @SteveSummit: Oh bother, `int x = ((void*)0)` does compile in C, just with a warning (on by default in GCC even without `-Wall`). I was assuming it wouldn't when I commented earlier. https://godbolt.org/z/dMYq8ceTT . So you're right, C23 compilers may also choose not to redefine `NULL` to `nullptr` because of nonsensical code like that. – Peter Cordes Dec 21 '22 at 21:08
  • 3
    @mmmmmm Claiming that it's a misfeature to use 0 for null pointers, on a machine with nonzero actual null pointers, betrays a misunderstanding, IMO. An analogy: in floating point, the bit pattern for `1.0` looks nothing like binary `0001`, and there have been machines where the bit pattern for `0.0` was not all-bits-0. Yet `float f = 0.0;` (and similarly `float f = 0;`) must clearly work as expected at a high level, meaning that the compiler is going to have to generate a nonzero bit pattern, if necessary, behind the scenes in the initialized data segment. – Steve Summit Dec 22 '22 at 04:02
  • 1
    @SteveSummit: What's unfortunate is the lack of any distinction between implementations where all-bits-zero is a valid null pointer and those where it isn't, in a manner that would allow e.g. `#if __STDC_ALL_BITS_ZERO_NULLS pointers = calloc(sizeof (int*), 100); #else pointers = malloc(100 * sizeof(int*)); if (pointers) for (size_t i=0; i – supercat Dec 22 '22 at 17:55
34

((void *)0) has stronger typing and can lead to better compiler or static-analyser diagnostics, since implicit conversions between pointers and plain integers aren't allowed in standard C.
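As a rough sketch of the difference (the variable names are made up, and the exact diagnostics are compiler-specific), the pointer-typed definition lets the compiler flag accidental integer uses of NULL:

#include <stddef.h>

int n = NULL;   /* with NULL as ((void *)0): constraint violation, the compiler complains;
                   with NULL as plain 0: compiles silently and the mistake goes unnoticed */
int *p = NULL;  /* fine with either definition */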

0 is likely allowed for historical reasons, from a pre-standard time when everything in C was pretty much just integers and wild implicit conversions between pointers and integers were allowed, though possibly resulting in undefined behavior.

Ancient K&R 1st edition provides some insight (7.14 the assignment operator):

The compilers currently allow a pointer to be assigned to an integer, an integer to a pointer, and a pointer to a pointer of another type. The assignment is a pure copy operation, with no conversion. This usage is nonportable, and may produce pointers which cause addressing exceptions when used. However, it is guaranteed that assignment of the constant 0 to a pointer will produce a null pointer distinguishable from a pointer to any object.

Lundin
  • 195,001
  • 40
  • 254
  • 396
  • 2
    `int *x = 0;` does not mean that `x` consists only of zeroed bits. Conversion from any other constant integer literal or any variable will raise a warning. I think that the real reason for `NULL` is implicit conversions like the one in `printf`. – tstanisl Dec 21 '22 at 10:19
  • 7
    @tstanisl Nobody (including K&R 1st ed) claimed that either. Null pointers and null pointer constants are different things. – Lundin Dec 21 '22 at 11:40
  • _"from a pre-standard time <...> possibly resulting in undefined behavior"_ — the concept of undefined behavior was introduced with the standard, so it wouldn't make sense to apply it to pre-standard implementations. – Ruslan Dec 21 '22 at 13:44
  • @Ruslan So call it non-portable code or what you will. A bug by any other name will stink just as bad. – Lundin Dec 21 '22 at 13:52
  • It's not necessarily a bug. Type punning used to be **much** more common than it is now. Even now it appears to be so important that C++20 introduced `std::bit_cast` (and C allows type punning via unions). – Ruslan Dec 21 '22 at 13:55
  • @Ruslan It's a bug if the intention is to write conforming C or portable code. Also the relation between data size and address bus width used to be way more shaky back in the days, so assuming that they had the same size would be very naive. It wasn't really until 32 bitters became mainstream (somewhere around the Intel 386 and 486 era, which is also around the time of ISO C) that someone finally settled for using 32 bit data width and 32 bit address bus in the same core. And even then weird non-standard addressing were still common. – Lundin Dec 21 '22 at 14:07
  • 2
    You might add that `0` and `((void *)0)` are not equivalent when passed to vararg functions such as `execl`. Systems where `int` and `void *` (or `char*`) have different widths or parameter passing conventions should define `NULL` as `((void *)0)` to avoid undefined behavior on `execl("/bin/ls", NULL)` and similar calls. – chqrlie Dec 21 '22 at 19:24
  • If everything about the state of every object `X` whose address is observable is encapsulated in the bit pattern held in `sizeof X` bytes starting at address `&X`, that fact would define the behavior of type punning for all types which don't have trap representations. – supercat Dec 22 '22 at 00:00
  • @Lundin: I think you are confusing the concepts of "*Strictly* Conforming C Program" and Conforming C Program. – supercat Dec 22 '22 at 09:22
  • @supercat Not at all. `int x = ((void*)0);` is a constraint violation and may therefore not be present in a strictly conforming program. Should a conforming implementation allow this code to pass without diagnostics, as an implementation-defined extension, it will alter the behavior of a strictly conforming program - suddenly every kind of wild assignment goes. Such an implementation is not conforming. – Lundin Dec 22 '22 at 09:55
  • @supercat Furthermore the standard explicitly says (5.1.1.3) "A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint." – Lundin Dec 22 '22 at 09:56
  • @Lundin: True, but the Standard also allows an implementation to accept a program after having issued such a diagnostic, or would allow an implementation to issue diagnostics even when given a strictly conforming program, and does not require that implementations make any distinction between those cases. If there exists within the universe a conforming C implementation that accepts a collection of source texts, then *by definition* that collection of source texts is a "conforming C program". – supercat Dec 22 '22 at 16:40
  • @Lundin: I'll readily admit that the definition of "conforming C program" is so broad as to be essentially meaningless, but that is by deliberate design. If before the C Standard was ratified, all implementations that could process a construct meaningfully would do so, but on 1% of implementations meaningful treatment would be impractical, such a state of affairs would have been expected to continue after the Standard was ratified. Declaring the construct illegitimate would have broken a lot of code, and declaring it legitimate would have make the Standard impractical to implement on... – supercat Dec 22 '22 at 16:49
  • ...some platforms. Thus, the Standard decided to remain completely agnostic as to the legitimacy of many constructs which have since become controversial. – supercat Dec 22 '22 at 16:51
  • Maybe I missed something, but I read all the answers and none of them [addressed this issue](https://stackoverflow.com/q/32431033/5358284). It might be irrelevant but still, the question title was a bit misleading to think that we can #define the same symbol for two different things. – polfosol ఠ_ఠ Dec 25 '22 at 12:28
16

Few things in C are more confusing than null pointers. The C FAQ list devotes an entire section to the topic, and to the myriad misunderstandings that eternally arise. And we can see that those misunderstandings never go away, as some of them are being recycled even in this thread, in 2022.

The basic facts are these:

  1. C has the concept of a null pointer, a distinguished pointer value which points definitively nowhere.
  2. The source code construct by which a null pointer is requested — a null pointer constant — fundamentally involves the token 0.
  3. Because the token 0 has other uses, ambiguity (not to mention confusion) is possible.
  4. To help reduce the confusion and ambiguity, for many years the token 0 as a null pointer constant has been hidden behind the preprocessor macro NULL.
  5. To provide some type safety and further reduce errors, it's attractive to have the macro definition of NULL include a pointer cast.
  6. However, and most unfortunately, enough confusion crept in along the way that properly mitigating it all has become almost impossible. In particular, there is so very much extant code that says things like strbuf[len] = NULL; (in an obvious but basically wrong attempt to null-terminate a string) that it is believed in some circles to be impossible to actually define NULL with an expansion including either the explicit cast or the hypothetical future (or extant in C++) new keyword nullptr.
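A minimal sketch of that legacy pattern (the function and buffer names are hypothetical):

#include <stddef.h>

void terminate_string(char *strbuf, size_t len)
{
    strbuf[len] = NULL;  /* legacy misuse: silent if NULL is plain 0,
                            diagnosed if NULL is ((void *)0) */
    strbuf[len] = '\0';  /* what such code actually means */
}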

See also Why not call nullptr NULL?

Footnote (call this point 3½): It's also possible for a null pointer — despite being represented in C source code as an integer constant 0 — to have an internal value that is not all-bits-0. This fact adds massively to the confusion whenever this topic is discussed, but it doesn't fundamentally change the definition.

Steve Summit
  • 45,437
  • 7
  • 70
  • 103
  • 2
    Just for the record, `int x = ((void*)0);` does compile in C, just with a warning (on by default in GCC even without `-Wall`). https://godbolt.org/z/dMYq8ceTT So GCC's current definition of `NULL` is compatible with confused legacy code that misuses `NULL` in non-pointer contexts, like in C++ except with warnings. But such code would break if compilers defined NULL as C23 `nullptr`, so unfortunately compilers will probably choose not to do that, just like in the C++ question you linked, [Why not call nullptr NULL?](https://stackoverflow.com/q/32302615) – Peter Cordes Dec 21 '22 at 21:55
  • Re: non-zero object-representation: it doesn't change *anything* about the `NULL` macro definition. It means code like `memset(ptr_array, 0, len)` isn't fully portable (whether or not it uses `NULL` instead of `0`; the middle arg is an `int` whose low `char` will be used as the fill pattern, so no definition of NULL can do anything about it.) Now I'm wondering about whether `char *str_array[4] = {NULL };` initializes the last 3 elements to null pointers, or to an object-representation of all-zeros on an implementation where that's not the same thing. Doesn't affect NULL's definition, though. – Peter Cordes Dec 21 '22 at 22:02
  • Comments on [How to write C/C++ code correctly when null pointer is not all bits zero](https://stackoverflow.com/posts/comments/52249339) say that ISO C and C++ requires that static storage like `static int *arr;` is initialized as if with `= 0`, thus a non-zero bit-pattern is required if that's how null pointers are represented. But another commenter remembers a real system where nulls were non-zero bit patterns but uninitialized static storage was filled with binary zeros. This is getting off-topic for the definition of `NULL`, sorry. – Peter Cordes Dec 21 '22 at 22:21
  • 1
    @PeterCordes Thanks for all the comments. Briefly: (1) `int x = NULL;` is less common and more obviously wrong than `char c = NULL;`, so I can hold out hope there's less of a chance someone will think they need to preserve it. (2) I deliberately avoided mentioning `memset`, to keep things a bit tidier. (3) These days I believe `char *str_array[4] = {NULL};` must be interpreted as if `char *str_array[4] = {NULL,0,0,0};` meaning you're guaranteed to get proper null pointers, but whether that was likely/true back in the day when there were machines with nonzero null pointers is another question. – Steve Summit Dec 22 '22 at 03:53
  • (4) As for "another commenter remembers", I think I remember that discussion, too, and my conclusion is that a compiler for such a system is/was simply going to have to put those uninitialized pointers in the initialized data segment. (Just like uninitialized floating-point variables on a machine where `0.0` is not all-bits-0.) – Steve Summit Dec 22 '22 at 03:55
  • Yeah, absolutely, putting zero-initialized things in the `.bss` section is totally normal, and something compiler writers might want to do even if it violates the C standard by not initializing pointers to be null pointers. Either by accident that then becomes unchangeable without breaking some existing code for that platform, or as an intentional tradeoff to not bloat executables for source that assumes `void *records[100000]` won't take space in the executable. – Peter Cordes Dec 22 '22 at 04:06
  • 2
    @PeterCordes There's no such thing as "does compile just a warning". A warning means "fatal bug or invalid C here! now you need to fix it", it doesn't mean "here's some cosmetic detail you can worry about a rainy day". As for what the compiler is obliged to do when it finds blatantly incorrect C, giving a warning is fine. [What must a C compiler do when it finds an error?](https://software.codidact.com/posts/277340) – Lundin Dec 22 '22 at 07:21
  • 1
    As for why `int x = ((void*)0);` specifically is invalid C, see ["Pointer from integer/integer from pointer without a cast" issues](https://stackoverflow.com/questions/52186834/pointer-from-integer-integer-from-pointer-without-a-cast-issues) – Lundin Dec 22 '22 at 07:23
  • @Lundin: We know that warning is something that needs fixing, but unfortunately the unwise people who originally put things like `char c = NULL` and `int i = NULL` into legacy codebases either didn't, or were using an implementation that defined NULL as integer `0`. The fact that current GCC compiles code like that means there can be (and probably is) a substantial amount of legacy code with that bug, which wouldn't build at all with compilers that made different implementation choices. – Peter Cordes Dec 22 '22 at 07:24
  • @PeterCordes No it's because gcc is lax against implicit pointer to integer conversions. `int* p = 123;` gives the same kind of warning. So it's not some backwards compatibility feature for null pointers, it's just gcc deciding to generate an executable anyway, even though it spotted invalid C with constraint violations. – Lundin Dec 22 '22 at 07:27
  • @Lundin: Ok yes, thanks for clearing that up, that's already invalid in ISO C like I expected it would be, GCC is just being lax. But the consequence is still basically the same, that legacy codebases may contain code like this because current implementations accept it (and vice versa). With current GCC `-std=gnu2x`, `int i = nullptr;` is rejected, since it's a conversion from `nullptr_t` rather than from a pointer type. https://godbolt.org/z/5rPnhqMof . IDK how much benefit `nullptr` brings to C where a `((void*)0)` definition already catches any non-pointer uses if warnings are heeded. – Peter Cordes Dec 22 '22 at 07:37
  • @Lundin I'm not going to get into a long argument about it here, as I know you stand by your opinion, but there *is* such a thing as "compiles with just a warning", and a warning does *not* necessarily mean "fatal bug or invalid C". A warning might mean "here's a cosmetic detail you can worry about later". Examples: unused variables; `if(a = b)`; misleading indentation. (And there are many others.) – Steve Summit Dec 22 '22 at 15:38
13

There is just one way to express NULL in C: it is a single 4-character token.
But hold on, when you look into its definition it gets more interesting.

NULL has to be defined as a null pointer constant, meaning an integer constant expression with value 0, or such an expression cast to void*.
As an integer constant expression is just an expression of integer type with a few restrictions to guarantee compile-time evaluation, there are infinitely many ways of writing any wanted value.

Of all those possibilities, only an integer literal with value 0 is also a null pointer constant in C++, for what it's worth.

The reason for such variation is history and precedent (everyone did it differently, void* was late to the party, and existing code and implementations trump all), reinforced by backwards compatibility, which preserves it.
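For illustration only (not a recommendation), a few of the forms that satisfy the definition quoted below, all of them valid null pointer constants:

int *a = 0;                 /* plain integer constant */
int *b = (void *)0;         /* the same, cast to void * */
int *c = 9 - 3 * 3;         /* any integer constant expression with value 0 */
int *d = (void *)(1 && 0);  /* likewise, optionally cast to void * */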

6.3.2.3 Pointers

[...] An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.67)
If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function. [...]

6.6 Constant expressions

[...]
Description
2 A constant expression can be evaluated during translation rather than runtime, and accordingly may be used in any place that a constant may be.
Constraints
3 Constant expressions shall not contain assignment, increment, decrement, function-call, or comma operators, except when they are contained within a subexpression that is not evaluated.117)
4 Each constant expression shall evaluate to a constant that is in the range of representable values for its type.
Semantics
5 An expression that evaluates to a constant is required in several contexts. If a floating expression is evaluated in the translation environment, the arithmetic range and precision shall be at least as great as if the expression were being evaluated in the execution environment.118)
6 An integer constant expression119) shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof or _Alignof operator.

Deduplicator
  • 44,692
  • 7
  • 66
  • 118
  • 2
    "But there are infinite ways that NULL may be defined by the implementation, even if we restrict us to standard C" No, there are only two. `0` or `(void*)`. When 7.19 says "NULL which expands to an implementation-defined null pointer constant" it means either of these two (or flavours of them such as `0L`), since those are the only null pointer constants. The binary representation of a _null pointer_ however can be anything. [What's the difference between null pointers and NULL?](https://software.codidact.com/posts/278657) – Lundin Dec 22 '22 at 07:36
  • 5
    @Lundin Any integer constant expression with value 0, or such cast to `void*`. That allows for arithmetic, logic, ternary operator, any number of parentheses, enums, non-evaluated sub-expressions, .... Which does not mean pushing it is a good idea. – Deduplicator Dec 22 '22 at 07:44
  • 1
    @Lundin An implementation is not prohibited from creating its own null pointer constants and defining `NULL` to expand to one of them… – user3840170 Dec 22 '22 at 22:13
  • @user3840170 It can do anything it likes in the realm of non-standard language extensions, obviously. Doing so is not explicitly implementation-defined behavior but a language extension. – Lundin Dec 23 '22 at 07:39
  • 1
    @Lundin If it added a keyword for a null pointer constant which is tagged for extra scrutiny to ensure it is only used in pointer-contexts, maybe calling it `__null` and using it for `NULL`, it would still be conforming though, as it doesn't change the semantics of any strictly conforming program. – Deduplicator Dec 23 '22 at 10:28
  • #define NULL (9-3*3) is perfectly fine. – gnasher729 Dec 23 '22 at 10:31
8

C was originally developed on machines where a null pointer and the integer constant 0 had the same representation. Later, some vendors ported the language to mainframes where a different special value triggered a hardware trap when used as a pointer, and wanted to use that value for NULL. These companies discovered that so much existing code type-punned between integers and pointers that they had to recognize 0 as a special constant that implicitly converts to a null pointer. ANSI C incorporated this behavior, at the same time as it introduced void * as a pointer type that implicitly converts to any object pointer type. This allowed NULL to be used as a safer alternative to 0.

I’ve seen some code that (possibly tongue-in-cheek) detected one of these machines by testing if ((char*)1 == 0).

Davislor
  • 14,674
  • 2
  • 34
  • 49
  • 4
    A related issue arises when passing arguments to non-prototyped or variadic functions. If one has a function which accepts a variable number of pointer arguments which must be followed by a null pointer, passing 0 as the last argument will work on platforms where pointers and int share the same representation, but may fail very badly on platfoms where e.g. integers get passed using one 16-bit stack slot while pointers use two 16-bit stack slots. – supercat Dec 21 '22 at 23:57
  • @supercat Very true! Another common gotcha is zeroing out a block of memory that holds a pointer, such as with `memset()` or `calloc()`. – Davislor Dec 22 '22 at 00:59
8

why do there exist two ways of expressing the same concept rather than one?

History.

NULL started as 0 and later better programming practices encouraged ((void *)0).


First, there are more than 2 ways:

#define NULL ((void *)0)
#define NULL 0
#define NULL 0L
#define NULL 0LL
#define NULL 0u
...

Before void * (Pre C89)

Before void * and void existed, #define NULL some_integer_type_of_zero was used.

It was useful for the size of that integer type to match the size of an object pointer. Consider the examples below: with 16-bit int and 32-bit long, it is useful for the type of the zero used to match the width of an object pointer.

Consider printing pointers.

double x;
printf("%ld\n", &x);  // On systems where an object pointer was same size as long
printf("%ld\n", NULL);// Would like to use the same specifier for NULL

With 32-bit object pointers, #define NULL 0L is better.

double x;
printf("%d\n", &x);  // On systems where an object pointer was same size as int
printf("%d\n", NULL);// Would like to use the same specifier for NULL

With 16-bit object pointers, #define NULL 0 is better.


C89

After the birth of void and void *, it was natural to have the null pointer constant be of pointer type. This allowed the bit pattern of (void *)0 to be non-zero, which was useful on some architectures.

printf("%p\n", NULL);

With 16-bit object pointers, #define NULL ((void*)0) works above.
With 32-bit object pointers, #define NULL ((void*)0) works.
With 64-bit object pointers, #define NULL ((void*)0) works.
With 16-bit int, #define NULL ((void*)0) works.
With 32-bit int, #define NULL ((void*)0) works.
We now have independence of the int/long/object pointer size. ((void*)0) works in all cases.

Using #define NULL 0 creates issues when passing NULL as a ... argument, hence the irksome need to do printf("%p\n", (void*)NULL); for highly portable code.
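The classic case (assuming a POSIX system for execl; the wrapper function is made up) is a variadic call that expects a pointer sentinel:

#include <unistd.h>

int run_ls(void)
{
    /* If NULL expands to plain 0, an int is passed where execl expects a
       char * sentinel, which breaks on ABIs where int and pointers differ
       in size or passing convention; the cast keeps the call portable
       regardless of how NULL is defined. */
    return execl("/bin/ls", "ls", "-l", (char *)NULL);
}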

With #define NULL ((void*)0), code like char n = NULL; is more likely to raise a warning, unlike with #define NULL 0.


C11

With the advent of _Generic, we can distinguish, for better or worse, NULL as a void *, int, long, ...
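As a sketch (requires C11 or later; the NULL_TYPE macro name is made up), _Generic can report which type a given implementation's NULL actually has:

#include <stddef.h>
#include <stdio.h>

#define NULL_TYPE(x) _Generic((x), \
    void *: "void *",              \
    int:    "int",                 \
    long:   "long",                \
    default: "something else")

int main(void)
{
    puts(NULL_TYPE(NULL));  /* prints "void *" with GCC's ((void *)0) definition;
                               would print "int" if NULL were plain 0 */
    return 0;
}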

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
7

According to §6.3.2.3 ¶3 of the C11 standard, a null pointer constant in C can be defined by an implementation as either the integer constant expression 0 or such an expression cast to void *.

No, that is a misleading paraphrase of the language spec. The actual language of the cited paragraph is

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. [...]

Implementations don't get to choose between those alternatives. Both are forms of a null pointer constant in the C language. They can be used interchangeably for the purpose.

Moreover, it is not only the specific integer constant expression 0 that can serve in this role: any integer constant expression with value 0 will do. For example, 1 + 2 + 3 + 4 - 10 is such an expression.

Additionally, do not confuse null pointer constants generally with the macro NULL. The latter is defined by conforming implementations to expand to a null pointer constant, but that doesn't mean that the replacement text of NULL is the only null pointer constant.

My implementation (GCC 9.4.0) defines NULL in stddef.h in the following ways:

#define NULL ((void *)0)
#define NULL 0

Not both at the same time, of course.

Why are both of the above expressions considered semantically equivalent in the context of NULL?

Again with the reversal. It's not "the context of NULL". It's pointer context. There is nothing particularly special about the macro NULL itself to distinguish contexts in which it appears from contexts where its replacement text appears directly.

And I guess you're asking for rationale for paragraph 6.3.2.3/3, as opposed to "because 6.3.2.3/3". There is no published rationale for C11. There is one for C99, which largely serves for C90 as well, but it does not address this issue.

It should be noted, however, that void (and therefore void *) was an invention of the committee that developed the original C language specification ("ANSI C" / C89 / C90). There was no possibility of an "integer constant expression cast to type void *" before then.

More specifically, why do there exist two ways of expressing the same concept rather than one?

Are there, really?

If we accept an integer constant expression with value 0 as a null pointer constant (a source-code entity), and we want to convert it to a runtime null pointer value, then which pointer type do we choose? Pointers to different object types do not necessarily have the same representation, so this actually matters. Type void * seems the natural choice to me, and that's consistent with the fact that, alone of all pointer types, void * can be converted to other object pointer types without a cast.

But then, in a context where 0 is being interpreted as a null pointer constant, casting it to void * is a no-op, so (void *) 0 expresses exactly the same thing as 0 in such a context.

What's really going on here

At the time the ANSI committee was working, many existing C implementations accepted integer-to-pointer conversions without a cast, and although the meaning of most such conversions was implementation and / or context specific, there was wide acceptance that converting constant 0 to a pointer yielded a null pointer. That use was by far the most common one of converting an integer constant to a pointer. The committee wanted to impose stricter rules on type conversions, but it did not want to break all the existing code that used 0 as a constant representing a null pointer.

So they hacked the spec.

They invented a special kind of constant, the null pointer constant, and provided rules around it that made it compatible with existing use. A null pointer constant, regardless of lexical form, can be implicitly converted to any pointer type, yielding a null pointer (value) of that type. Otherwise, no implicit integer-to-pointer conversions are defined.

But the committee preferred that null pointer constants should actually have pointer type without conversion (which 0 does not, pointer context or no), so they provided for the "cast to type void *" option as part of the definition of a null pointer constant. At the time, that was a forward-looking move, but the general consensus now appears to be that it was the right direction to aim.

And why do we still have the "integer constant expression with value 0"? Backwards compatibility. Consistency with conventional idioms such as {0} as a universal initializer for objects of any type. Resistance to change. Perhaps other reasons as well.
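For instance, the {0} idiom covers structs whose first member is a pointer only because 0 doubles as a null pointer constant (struct record here is a made-up example):

struct record {
    char  *name;    /* first member is a pointer: the 0 in {0} initializes it,
                       which is valid only because 0 is a null pointer constant */
    int    count;
    double weight;
};

struct record r = {0};  /* name is a null pointer; count and weight are zero */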

John Bollinger
  • 160,171
  • 8
  • 81
  • 157
  • I think the OP (like many others) is confused about the terms _null pointer_ vs the NULL macro. The implementation must indeed treat pointers assigned to either `0` or `(void*)0` as null pointers, but it does get to choose whether it wants to define NULL as the null pointer constant `0` or the null pointer constant `(void*)0`. – Lundin Dec 22 '22 at 07:30
  • I think so too, @Lundin, and this answer speaks towards such a misunderstanding in a couple of places. In particular, "do not confuse null pointer constants generally with the macro `NULL`", pointing out that a null-pointer constant is a source-code entity, and discussing the conversion of null pointer constants into runtime values all do this. Perhaps that's not enough, I dunno. – John Bollinger Dec 22 '22 at 13:41
3

The "why" - it is for historical reasons. NULL was used in various implementations before it was added to a standard. And at the time it was added to a C standard, implementations defined NULL usually as 0, or as 0 cast to some pointer. At that point you wouldn't want to make one of them illegal, because whichever you made illegal, you'd break half the existing code.

gnasher729
  • 51,477
  • 5
  • 75
  • 98
-2

The C11 standard allows for a null pointer constant to be defined either as the integer constant expression 0 or as an expression that is cast to void *. The use of the NULL macro makes it easier for programmers to use the null pointer constant in their code, as they don't have to remember which of these definitions the implementation uses.

Using a macro also makes it easier to change the underlying definition of the null pointer constant in the future, if necessary. For example, if the implementation decided to change the definition of NULL to be a different integer constant expression, they could do so by simply modifying the definition of the NULL macro. This would not require any changes to the code that uses the NULL macro, as long as the code uses the NULL macro consistently.

There are two definitions of the NULL macro provided in the example you gave because some systems may define NULL as an expression that is cast to void *, while others may define it as the integer constant expression 0. By providing both definitions, the stddef.h header can be used on a wide range of systems without requiring any modifications.