From the C17 draft (6.3.2.3 ¶3):

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.[67] If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

[67] The macro NULL is defined in <stddef.h> (and other headers) as a null pointer constant [...].

From this, it follows that the following are null pointer constants: 0, 0UL, (void *)0, (void *)0UL, NULL.

It further follows that the following are null pointers: (int *)0, (int *)0UL, (int *)(void *)0, (int *)(void *)0UL, (int *)NULL. Interestingly, none of these are "null pointer constants"; see here.
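The conversions listed above can be checked with a minimal sketch. The function names are hypothetical, chosen only for illustration; the code relies on two guarantees from 6.3.2.3: any two null pointers compare equal, and a null pointer compares unequal to a pointer to any object.

```c
#include <stdbool.h>
#include <stddef.h>

/* Converts several null pointer constants to int * and reports
   whether the resulting null pointers all compare equal. */
bool all_conversions_agree(void) {
    int *a = 0;                   /* implicit conversion in an initializer */
    int *b = (int *)0UL;          /* explicit cast of an integer constant */
    int *c = (void *)0;           /* void * null pointer constant */
    int *d = (int *)(void *)0UL;  /* cast of a void * null pointer constant */
    int *e = NULL;                /* whatever form the implementation chose */
    return a == b && b == c && c == d && d == e;
}

/* A null pointer must compare unequal to a pointer to any object. */
bool differs_from_object(void) {
    int object = 42;
    int *null_ptr = 0;
    return null_ptr != &object;
}
```

Both functions should return true on any conforming implementation, whatever bit pattern it uses for null pointers.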

The following null pointer constants are null pointers (because void * is a pointer type and 0 and 0UL are null pointer constants): (void *)0, (void *)0UL. In this regard, according to the C17 draft (6.2.5 ¶19-20):

The void type comprises an empty set of values; it is an incomplete object type that cannot be completed.
[...]
A pointer type may be derived from a function type or an object type, called the referenced type. [...] A pointer type is a complete object type.

void is not a pointer type itself, and it is an incomplete object type. But void * is a pointer type.

But it seems that the following are null pointer constants which are not null pointers (because there is no cast to a pointer type): 0, 0UL, NULL. (To be precise, while the standard only requires that NULL be defined as "a null pointer constant", it would be permissible to define it as a null pointer constant which is also a null pointer. But it seems that the standard doesn't require NULL to be defined in such a way that it is simultaneously a null pointer.)

Is every null pointer constant a null pointer? (Is NULL really not a null pointer?)

Finally (and somewhat tongue-in-cheek): In case certain null pointer constants are not null pointers, would they technically be a kind of "non-null pointer"? (This wording appears in some places in the standard.) Note that linguistically we have a so-called bracketing paradox; we can read this as "[non-null] pointer" or "non-[null pointer]".

Lover of Structure
  • I feel like this is the CS equivalent of philosophy majors trying to define "what it means to exist". I'd stand on the side of "the null pointer constants 0, 0L, and NULL are not null pointers", because there is nothing to imply they point to any data until they are cast. There's more of a case for NULL being considered a null pointer than 0 or 0L: NULL as a value implies the capacity to hold data, which is similar to what one would expect of pointers, while 0 and 0L are perfectly valid for initialized variables. – Gumpf May 10 '23 at 14:43
  • I do not understand. You state that `the following are null pointer constants which are not null pointers` and then you follow with `Is every null pointer constant a null pointer?`. You stated the answer. – KamilCuk May 10 '23 at 14:51
  • @KamilCuk What I state is that "it *seems* that [...]" – that is, I placed the respective statement under the scope of a modal operator (a term from modal logic) to indicate uncertainty. It is very much possible that I misread or missed relevant parts of the standard. – Lover of Structure May 10 '23 at 14:54
  • Och, sure. The thing is, `void` did not exist originally, so there was `#define NULL 0`. So that both `NULL (void*)0` and `NULL 0` are correct, there is this "null pointer constant is an integer or pointer". Something not mentioned is that it covers any constant expression with the value 0; think about `1-1`, `1*0`, etc. Also see https://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf page 144. Also, if one does not know it: https://stackoverflow.com/questions/49615338/why-does-1-int1-voidx-0l-work-correctly – KamilCuk May 10 '23 at 14:58
  • Linguistically, I personally always read hyphenation as "binding tighter" than spaces. So "non-null pointer" always looks to me to be saying "a pointer that is non-null", rather than "a thing that is not a null pointer". I acknowledge this isn't universally adhered to, but I would always find another way to phrase it if I wanted the "thing that is not a null pointer" meaning, **especially** in technical writing. – Ben May 11 '23 at 03:22
  • @Ben I agree with your sentiment. In English (but not other languages), people (in professional typesetting) often use an en-dash ("post–World War II", "non–null pointer") for the non-intuitive bracketing, and in German they hyphenate the whole compound (*Durchkopplung:* "Harry-Potter-Roman" because writing "Harry Potter-Roman" would suggest X[YZ]-grouping instead of the actual [XY]Z, because hyphens *look like they bind more tightly than spaces*). See [this answer of mine](https://tex.stackexchange.com/a/60038/14996) on TeX.SE. – Lover of Structure May 11 '23 at 03:41
  • Don't forget `nullptr` [from C23](https://en.cppreference.com/w/c/language/nullptr). *The keyword nullptr denotes a predefined null pointer constant. It is a non-lvalue of type nullptr_t. nullptr can be converted to a pointer type or bool, where the result is the null pointer value of that type or false respectively.* – Aykhan Hagverdili May 11 '23 at 08:03
  • This is the reason why you're supposed to cast `NULL` to a pointer type when passing it to a variadic function, e.g. the last argument in `execl()`. – Barmar May 11 '23 at 14:43
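The point about variadic functions in the last comment can be sketched with a small example. `count_strings` is a hypothetical helper, not a library function; it stands in for something like `execl()`, which also expects a terminating `char *` null pointer. Variadic arguments get no conversion to a pointer type, so a bare `NULL` defined as `0` would be passed as an `int`.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts its string arguments up to a
   terminating null pointer of type char *. */
size_t count_strings(const char *first, ...) {
    size_t n = 0;
    va_list ap;
    va_start(ap, first);
    /* Each variadic argument is read back as char *, so the caller
       must really pass a char * (not an int 0) as the terminator. */
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

A call such as `count_strings("a", "b", "c", (char *)NULL)` is well defined everywhere; dropping the cast only works by accident on implementations where `NULL` already has a pointer type or where `int` and pointers happen to be passed the same way.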

6 Answers

Is every null pointer constant a null pointer?

TL;DR: no.

As you have already observed, integer constant expressions with value 0 are null pointer constants, despite not having pointer type. You have also quoted the specification's definition of null pointer: "a null pointer constant [] converted to pointer type". That means that null pointer constants of this general form ...

(void *)(<integer constant expression with value 0>)

... satisfy the definition of "null pointer". The integer constant expression is a null pointer constant itself, so the cast makes the overall expression a null pointer (in addition to being a null pointer constant).

On the other hand, null pointer constants that take the form of integer constant expressions with value 0 do not satisfy the definition of "null pointer", and there is no other provision in the language spec that would make them null pointers. Examples: 0, 0x00UL, 1 + 2 + 3 - 6.

it seems that the standard doesn't require NULL to be defined in such a way that it is simultaneously a null pointer.

Correct.

Is every null pointer constant a null pointer?

Definitely not (see above), but for most purposes, it does not matter.

(Is NULL really not a null pointer?)

It depends on your C implementation. The language spec allows either answer. In practice, it is a null pointer in most implementations you're likely to meet.

In case certain null pointer constants are not null pointers, would they technically be a kind of "non-null pointer"?

No. Null pointer constants that are not null pointers are not pointers at all. They are integers.

John Bollinger

Is every null pointer constant a null pointer?

No, and the reason why is in the text you quoted:

If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

A null pointer constant is not automatically a pointer, just like any integer constant is not automatically a pointer. The constant value must be converted to a pointer type to produce a null pointer.

The resulting null pointer does not have to be zero-valued. It only has to be a value that cannot be the address of any object or function. That value may be 0x00000000 (and on the implementations I'm familiar with it is), or it may be 0xFFFFFFFF, or it may be 0xDEADBEEF, or it may be something else.
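A minimal sketch of the distinction between what the language guarantees and what depends on the representation. The function names are hypothetical. The first function relies only on guarantees of the C standard; the second inspects the object representation, whose result is implementation-specific and therefore not stated here.

```c
#include <stdbool.h>
#include <string.h>

/* Guaranteed on every conforming implementation: a null pointer
   compares equal to 0 and is treated as false in conditions. */
bool null_tests_as_false(void) {
    int *p = 0;
    return p == 0 && !p && (p ? false : true);
}

/* Implementation-specific: reports whether the object representation
   of a null int * happens to be all-bits-zero on this platform. */
bool null_is_all_bits_zero(void) {
    int *p = 0;
    unsigned char zeros[sizeof p] = {0};
    return memcmp(&p, zeros, sizeof p) == 0;
}
```

Code that fills a structure with `memset(..., 0, ...)` and then treats its pointer members as null is relying on the second property, not the first.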

John Bode
  • Most modern architectures do encourage null pointers to be zero valued, such that a type-agnostic zero-filled buffer reinterpreted as a pointer (or structure containing a pointer field) will be a valid null pointer. The language does permit other values (in which case code can get in trouble if it omits constructors or otherwise plays too loose with the rules, if it ever finds itself on one of these architectures). Even in modern times there are some good reasons why an architecture might choose a non-zero null pointer underlying value, though, even though it's a bit more work. – Miral May 11 '23 at 04:49
  • In the Symbolics Lisp Machine C implementation, the value of a null pointer was something like `#A(NIL 0)` – Barmar May 11 '23 at 14:40
  • "The resulting null pointer does not have to be zero-valued" is misleading to novices in a very important way: _regardless of the bit representation of a null pointer_, the C standard requires it to compare equal to zero, and to be treated as false by `if`, `&&`, etc. – zwol May 11 '23 at 15:14
  • The C Standard requires that a null pointer compare equal to the null pointer constant. It does not require that it compare equal to anything else associated with the concept of "zero". – supercat May 11 '23 at 19:29
  • @supercat Can you think of an example of a valid expression, in which a null pointer is compared for equality with something which is "zero" but not itself a null pointer, where the result is not required to be "equal", and which does not involve explicitly inspecting the bit representation (e.g. with `memcmp(&null_ptr, ...)`) ? – zwol May 12 '23 at 16:18
  • @supercat: The C Standard requires that the pointer-comparison operation indicates equality between one null pointer and another null pointer or null pointer constant. It doesn't require that other sorts of comparison (like `memcmp`) give a "these are equal" result. – Ben Voigt May 12 '23 at 16:19
  • @BenVoigt I'm like 85% sure that all possible expressions which compare a null pointer and some other piece of data which is "zero" in some sense, fall into three classes: (1) required to report equality; (2) constraint violation; (3) explicitly inspect the bit representation of the null pointer (e.g. `memcmp` would do this). – zwol May 12 '23 at 16:22
  • @zwol: How about `(uintptr_t)somePointer == 0`, or `ptr == (void*)someUintPtr`, where `someUintPtr` is a non-constant expression of type `uintptr_t`? Note that if `someUintPtr` was formed by converting a null-pointer to type `uintptr_t`, a pointer formed by converting that value back to `(void*)` must compare equal to a null pointer, but the `uintptr_t` value might not be zero, and on implementations where converting a null pointer to `uintptr_t` would yield a non-zero value, converting a `uintptr_t` with value 0 to a pointer might not yield a null pointer. – supercat May 12 '23 at 19:50
  • @zwol: An implementation where the normal representation of a null pointer was 0xABCD could process conversions between `uintptr_t` and pointer types by xor'ing the bit pattern with 0xABCD so that a `uintptr_t` value of 0 would represent a null pointer, but it might also process such conversions in representation-preserving fashion. Neither approach can really be described as unambiguously better than the other, and the Standard waives judgment as to which should be preferred on platforms where the natural representation of a null pointer isn't all bits zero. – supercat May 12 '23 at 19:58
  • @supercat I would describe any operation involving a cast to or from `[u]intptr_t` as "inspecting the bit representation". – zwol May 12 '23 at 20:09
  • @zwol: Many implementations process conversions between pointers and integers in representation-preserving fashion, but the Standard makes no distinction between those that do, and those where there is no discernible relationship between pointer and integer values. – supercat May 12 '23 at 20:18
  • @supercat - I tested this a long time ago, and what I'm almost sure happens is this: the non-zero value of the pointer will be preserved and will be stored in the uintptr_t. It's simply that the comparison with the null pointer constant will check for said value instead of 0 (internally and in all cases; it's part of the implementation). I may post a question and answer it myself on this topic sometime (with said implementation, which btw is not obsolete either, so anyone will be able to easily verify). – AnArrayOfFunctions May 13 '23 at 12:06
  • Although tbh said implementation uses a non-standard keyword for declaring such pointers (the null pointer constant is still 0; it just has different behaviour when compared with such a pointer), so it might not be the most conforming. So yeah, I'm debating whether to create such a question in this scenario. – AnArrayOfFunctions May 13 '23 at 12:12

The null pointer constant may be a void * or some integer type.

Test on your machine:

#include <stdio.h>
#include <stdlib.h>

#define NULL_TEST(n) _Generic((n), \
  void *: "void *", \
  int: "int", \
  long: "long", \
  default: "something else" \
)

int main(void) {
  printf("%s\n", NULL_TEST(NULL));
  printf("%s\n", NULL_TEST((void*)0));
  printf("%s\n", NULL_TEST(0));
  printf("%s\n", NULL_TEST(0L));
}

On my machine, I got the output below. The first line may differ on yours, since NULL may be defined with either a pointer type or an integer type.

void *
void *
int
long
chux - Reinstate Monica
  • "Test on your platform" is rarely a good answer to a [tag:language-lawyer] question. First sentence is correct, though. – Toby Speight May 11 '23 at 13:48
  • Why use `printf` instead of `puts` here? – Aykhan Hagverdili May 12 '23 at 20:19
  • @AyxanHaqverdili For over 10 years, I have seen good compilers emit the same code for `printf("%s\n", string);` and `puts(string);`. This makes the choice between the two a matter of style. So code whatever you like and trust the compiler to emit good code. One or the other is a side issue to this question. I tend to code with `puts()` when other `printf()` calls are not lurking about. Learners tend to understand `printf()` more easily. Your call. – chux - Reinstate Monica May 14 '23 at 00:48

No. In fact, no null pointer constant is a null pointer! This is because constants and pointers are different kinds of entities.

A null pointer constant is a constant expression which has a particular form. An expression is a sequence of tokens, and null pointer constants are defined as sequences of tokens that have a particular form.

A null pointer is a value. In C, each type has its set of potential values. For each pointer type, one or more values in that set are null pointers. The C standard does not define the concept of value formally. A formal semantics would need to do this (and formally defining values of pointers gets rather complicated, which is why the C standard, an English document written without mathematics, doesn't try).

An expression evaluates to a value in a context (possibly causing side effects). All null pointer constants whose type is a pointer type evaluate to a null pointer. Some null pointer constants (e.g. 0, 1L - 'z' / 'z') have an integer type, and those do not evaluate to a null pointer: they evaluate to a null integer (i.e. an integer with the value 0 — the C standard does not use the expression “null integer” because it isn't anything remarkable that would need a specific name).

The C standard guarantees that if e is a constant expression with an integer type and the value 0, then any expression that converts this value to a pointer type evaluates to a null pointer. Note that this guarantee is not given for arbitrary expressions: (void*) f() might not be a null pointer even if f is defined as int f(void) { return 0; }.
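The difference between a constant zero and a run-time zero is visible at the type level, which the following sketch tries to show. It assumes the conditional-operator rule in C17 6.5.15 ¶6: if one of the pointer operands is a null pointer constant, the result has the type of the other operand; otherwise, with a `void *` operand, the result is `void *`. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Selects 1 if (void *)(x) is a null pointer constant, 0 otherwise.
   If it is one, the conditional expression has type int *;
   if not, it has type void *. Nothing here is evaluated at run time. */
#define IS_NULL_POINTER_CONSTANT(x) \
    _Generic((1 ? (void *)(x) : (int *)0), int *: 1, void *: 0)

/* Hypothetical helper: returns 0, but a call is not a constant expression. */
static intptr_t runtime_zero(void) { return 0; }

int literal_zero_is_npc(void) { return IS_NULL_POINTER_CONSTANT(0); }
int folded_zero_is_npc(void)  { return IS_NULL_POINTER_CONSTANT(1L - 'z' / 'z'); }
int runtime_zero_is_npc(void) { return IS_NULL_POINTER_CONSTANT(runtime_zero()); }
```

The first two functions classify their argument as a null pointer constant; the third does not, even though `runtime_zero()` always returns 0. That matches the statement above that the guarantee covers constant expressions only.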

The C standard allows NULL to have either an integer type or a pointer type. If it has a pointer type, the expression NULL evaluates to a null pointer. If it has an integer type, it doesn't.

Gilles 'SO- stop being evil'

Another fun one is that '\0' has type int (6.4.4.4 ¶10) and “The numerical value of the octal integer so formed specifies the value of the desired character or wide character” (¶5), and the same holds for a hexadecimal escape. So both '\0' and '\x0' are null pointer constants as well. In addition, “floating constants that are the immediate operands of casts” (which must be casts of an arithmetic type to an integer type) are among the legal operands of “integer constant expressions,” so (int)0.0 is a null pointer constant. So could be enumeration constants, the results of sizeof (although all standard types have a size of at least 1, some compilers have zero-size fields as an extension) and _Alignof (although the Standard says every valid alignment is a nonnegative integral power of 2, and an alignment specifier of 0 is ignored), and the results of an operator whose operands have integer types, for example X^X or !1.
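A minimal sketch of these unusual spellings, assuming a compiler that follows the standard's definition of integer constant expressions (some compilers warn about such initializers, which is reasonable, since nobody should write this in real code). The enumeration constant and function name are hypothetical.

```c
#include <stdbool.h>

enum { ZERO_ENUMERATOR = 0 };  /* hypothetical enumeration constant */

/* Each initializer is an integer constant expression with value 0,
   so each one is a null pointer constant and yields a null pointer. */
bool unusual_constants_yield_null(void) {
    int *from_octal = '\0';             /* character constant, type int */
    int *from_hex   = '\x0';            /* hexadecimal escape */
    int *from_cast  = (int)0.0;         /* floating constant as cast operand */
    int *from_enum  = ZERO_ENUMERATOR;  /* enumeration constant */
    int *from_not   = !1;               /* operator on integer constants */
    int *reference  = 0;
    return from_octal == reference && from_hex == reference &&
           from_cast == reference && from_enum == reference &&
           from_not == reference;
}
```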

Some compilers define NULL as a special keyword when the header is compiled as C++, such as __null in GCC, or as nullptr. This lets the compiler diagnose uses of NULL where an integer constant or void * would be implicitly converted to something that is not a pointer, such as a boolean.

Davislor
  • I don't think either `sizeof` or `_Alignof` expressions can evaluate to 0. But a null pointer constant can *include* such expressions as part of a larger integer constant expression. – John Bollinger May 11 '23 at 13:25
  • @JohnBollinger By the standard, “Every valid alignment value shall be a nonnegative integral power of two.” Also, “Except for bit-fields, objects are composed of contiguous sequences of one or more bytes,” and a bitfield isn’t allowed to have a size of 0 either. However, I’m pretty sure that compilers are allowed to have extensions for which `sizeof` returns zero. One common extension is a field with no unique address, intended to have zero bytes of storage, and the implementation might then want the size of all elements of the `struct` not to exceed the size of the `struct`. – Davislor May 11 '23 at 15:51
  • Ok, but this being a language-lawyer question, if you want to delve into non-conforming extensions then you should at least make note that you are doing so. – John Bollinger May 11 '23 at 17:00

C was designed in such a way that, at least on the platforms it originally targeted, pointers and integers could be treated essentially interchangeably in most contexts. Given char *p; int i;, a compiler processing p=0; would process it essentially the same way as i=0;, except that the former would write the value 0 to the address of p, while the latter would store the value 0 to the address of i. There was no need for a compiler to understand the concept of a null pointer, because the same compiler logic that would be used to set i to the numerical value zero could just as effectively set p to a value that wouldn't be associated with any object and would behave as the value zero.

The way the C Standard is written does not allow the type of an expression to vary depending upon the context where it is used. While it might make sense to say that the right-hand operand of the assignment operator in p=0; would have pointer type, and that the right-hand operand in i=0; would have integer type, the design of the Standard requires that they both have the same type. Because there is no "normal" type that could be used in both contexts, the authors of the C Standard created a special "type" for expressions which should be equally usable in both contexts. I think the term "null pointer constant" is more confusing than necessary, and that "universal zero" would be clearer, since what the zero represents isn't just the number zero, or a null pointer, or an all-zeroes bit pattern, but more generally the default value of a static-duration object.
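The "default value of a static-duration object" reading can be illustrated with a short sketch. The type and names are hypothetical. It relies on the standard's initialization rules: objects with static storage duration that have no initializer start out as a null pointer (for pointers) or zero (for arithmetic types), and members omitted from an initializer list get the same treatment.

```c
#include <stdbool.h>
#include <stddef.h>

static int *static_ptr;  /* no initializer: starts as a null pointer */
static int  static_int;  /* no initializer: starts as 0 */

struct node { int value; struct node *next; };  /* hypothetical type */

bool defaults_are_universal_zero(void) {
    static struct node whole;              /* value == 0, next is null */
    struct node partial = { .value = 7 };  /* next implicitly null */
    return static_ptr == NULL && static_int == 0 &&
           whole.value == 0 && whole.next == NULL &&
           partial.value == 7 && partial.next == NULL;
}
```

This holds even on an implementation whose null pointers are not all-bits-zero, because the initialization is defined in terms of values rather than bit patterns.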

supercat