> I've read that in "most implementations" NULL points to 0, whatever that means.
No, it is 0; it's not a pointer to anything. So yes, `NULL equ 0` is correct, or just `push 0`.
In C source, `(void*)0` is always NULL, but implementations are allowed to internally use a different non-zero bit-pattern for the object-representation of `int *p = NULL;`. Implementations that choose a non-zero bit-pattern must translate between the source-level `0` and that bit-pattern at compile time. (And the translation only works for compile-time integer constant expressions with value zero that appear in a pointer context, not for `memset` or whatever.) The C++ FAQ has a whole section on NULL pointers, which also applies to C in this case.
(It's legal in C to access the bit-pattern of an object with `memcpy` into an integer, or with `(char*)` aliasing onto it, so it is possible to detect this in a well-formed program that's free from undefined behaviour. Or of course by looking at the asm or memory contents with a debugger! In practice you can easily check what the asm for a NULL pointer looks like by compiling `int *foo(){ return NULL; }`.)
See also Why is address zero used for the null pointer? for some more background.
> Why is there ambiguity? What is the equivalent to NULL in x86 instruction set?
In all x86 calling conventions / ABIs, the asm bit-pattern for NULL pointers is integer `0`. So `push 0` or `xor edi,edi` (RDI=0) is always what you want on x86 / x86-64. (Modern calling conventions, including all x86-64 conventions, pass args in registers; Windows x64 passes the first arg in RCX, not RDI.)
@J...'s answer shows how to push args in right-to-left order for the calling convention you're using, resulting in the first (left-most) arg at the lowest address. Really you can store them to the stack however you like (e.g. with `mov`) as long as they end up in the right place when `call` runs.
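As a sketch of both approaches for a hypothetical 32-bit cdecl call like `foo(NULL, 1)` (NASM syntax; `foo` and the second arg are made-up for illustration):

```nasm
extern foo

; push version: args pushed right-to-left, so the last push is the first arg
    push 1              ; second arg
    push 0              ; first arg = NULL (bit-pattern 0)
    call foo
    add  esp, 8         ; caller pops the args in cdecl

; mov version: reserve space, then store args at the right offsets
    sub  esp, 8
    mov  dword [esp], 0     ; first arg (lowest address) = NULL
    mov  dword [esp+4], 1   ; second arg
    call foo
    add  esp, 8
```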
The C standard allows it to be different because C implementations on some hardware might want to use something else, e.g. a special bit-pattern that always faults when dereferenced, regardless of context. Or if `0` was a valid address value in real programs, it's better if `p == NULL` is always false for valid pointers. Or any other arcane hardware-specific reason.
So yes, there could have been some C implementations for x86 where `(void*)0` in the C source turns into a non-zero integer in the asm. But in practice there aren't. (And most programmers are happy that `memset(array_of_pointers, 0, size)` actually sets them to NULL, which relies on the bit-pattern being `0`; some code makes that assumption without thinking about the fact that it's not guaranteed to be portable.)
This is not done on x86 in any of the standard C ABIs. (An ABI is a set of implementation choices that compilers all follow so their code can call each other, e.g. agreeing on struct layout, calling conventions, and what `p == NULL` means.)
I'm not aware of any modern C implementations that use a non-zero `NULL` on other 32 or 64-bit CPUs either; virtual memory makes it easy to avoid address 0.
http://c-faq.com/null/machexamp.html has some historical examples:

> The Prime 50 series used segment `07777`, offset `0` for the null pointer, at least for PL/I. Later models used segment `0`, offset `0` for null pointers in C, necessitating new instructions such as `TCNP` (Test C Null Pointer), evidently as a sop to [footnote] all the extant poorly-written C code which made incorrect assumptions. Older, word-addressed Prime machines were also notorious for requiring larger byte pointers (`char *`) than word pointers (`int *`).
... see the link for more machines, and the footnote from this paragraph.
https://www.quora.com/On-which-actual-architectures-is-Cs-null-pointer-not-a-binary-zero-all-bits-zero reports finding a non-zero NULL on 286 Xenix, I guess using segmented pointers.
Modern x86 OSes make sure processes can't map anything into the lowest page of virtual address space, so NULL pointer dereference always faults noisily to make debugging easier.
e.g. Linux by default reserves the low 64kiB of address space (`vm.mmap_min_addr`). This helps whether the zero came from a NULL pointer in the source, or whether some other bug zeroed a pointer with integer zeros. Reserving 64k instead of just the low 4k page also catches indexing a pointer as an array, like `p[i]` with small to medium `i` values.
Fun fact: Windows 95 mapped the lowest pages of user-space virtual address space to the first 64kiB of physical memory to work around a 386 B1 stepping erratum. But fortunately it was able to set things up so access from a normal 32-bit process did fault. Still, 16-bit code running in DOS compat mode could trash the whole machine very easily.
See https://devblogs.microsoft.com/oldnewthing/20141003-00/?p=43923 and https://news.ycombinator.com/item?id=13263976