55

Sometimes the data at memory address 0x0 is quite valuable. Take the x86 real-mode IVT as a well-known example: it starts at 0x0 and contains pointers to interrupt handlers, and the dword at 0x00 is a pointer to the division-by-zero error handler.
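To make the layout concrete, here is a minimal sketch of how a real-mode IVT entry decodes. Each of the 256 entries is 4 bytes (a 16-bit offset followed by a 16-bit segment, little-endian), so vector n sits at address n * 4. The names `ivt_entry`, `ivt_read` and `ivt_linear` are hypothetical, and `mem` is an ordinary byte buffer standing in for memory starting at 0x0, so nothing here actually touches address zero.

```c
#include <stdint.h>

/* One real-mode IVT entry: 16-bit offset, then 16-bit segment. */
struct ivt_entry {
    uint16_t offset;
    uint16_t segment;
};

/* Decode vector n from mem, a hypothetical byte image of memory at 0x0. */
struct ivt_entry ivt_read(const uint8_t *mem, unsigned n)
{
    const uint8_t *p = mem + 4u * n;            /* each entry occupies 4 bytes */
    struct ivt_entry e;
    e.offset  = (uint16_t)(p[0] | (p[1] << 8)); /* little-endian offset */
    e.segment = (uint16_t)(p[2] | (p[3] << 8)); /* little-endian segment */
    return e;
}

/* Real-mode linear address of the handler: segment * 16 + offset. */
uint32_t ivt_linear(struct ivt_entry e)
{
    return ((uint32_t)e.segment << 4) + e.offset;
}
```

Calling `ivt_read(mem, 0)` decodes the division-by-zero vector mentioned above.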

However, the C11 language standard prohibits dereferencing null pointers [WG14 N1570 6.5.3.2], which are defined as pointers initialized with 0 or pointers initialized with a null pointer [WG14 N1570 6.3.2.3], effectively banning the very first byte.

How do people actually use 0x0 when it's needed?

hippietrail
gfv
  • Are you referring to C++11? That standard specifically says that nullptr is not an integer at all. IE nullptr != 0x0. – GreenAsJade Jan 14 '14 at 01:47
  • Can you please reference the relevant portion of the C11 spec? – Mgetz Jan 14 '14 at 01:47
  • @Mgetz you'd want to look at sections "6.5.3.2 Address and indirection operators" with a footnote 102 for unary * dereferencing operator and "6.3.2.3 Pointers" for the definition of null pointer. – gfv Jan 14 '14 at 01:49
  • 2
    @GreenAsJade No, not C++ here, just plain C. – gfv Jan 14 '14 at 01:53
  • Ah - interesting, sorry my mistake. Googling around there's a lot of discussion of the very question you are asking here, but I didn't find an answer yet either! – GreenAsJade Jan 14 '14 at 01:54
  • @gfv: +1. Can you suggest a link showing the use of dereferencing the 0x00 address? – Pranit Kothari Jan 14 '14 at 01:56
  • 7
    The null pointer is the pointer you get from an expression like `(void *)0`, but it is not *necessarily* the same as a pointer to address zero. – hobbs Jan 14 '14 at 02:01
  • Is the kernel space addressable from within an application that runs in userspace? Won't address 0x0 be relative to the base address of the userspace? – alvits Jan 14 '14 at 03:00
  • 2
    @alvits In real mode (16 bit mode), no. There is no separation of userspace and kernel space in real mode. – chbaker0 Jan 14 '14 at 03:46

5 Answers

48

C does not prohibit dereferencing the null pointer, it merely makes it undefined behavior.

If your environment is such that you're able to dereference a pointer containing the address 0x0, then you should be able to do so. The C language standard says nothing about what will happen when you do so. (In most environments, the result will be a program crash.)

A concrete example (if I'm remembering this correctly): On the 68k-based Sun 3 computers, dereferencing a null pointer did not cause a trap; instead, the OS stored a zero value at memory address zero, and dereferencing a null pointer (which pointed to address zero) would yield that zero value. That meant, for example, that a C program could treat a null pointer as if it were a valid pointer to an empty string. Some software, intentionally or not, depended on this behavior. This required a great deal of cleanup when porting software to the SPARC-based Sun 4, which trapped on null pointer dereferences. (I distinctly remember reading about this, but I was unable to find a reference; I'll update this if I can find it.)

Note that the null pointer is not necessarily address zero. More precisely, the representation of a null may or may not be all-bits-zero. It very commonly is, but it's not guaranteed. (If it's not, then the integer-to-pointer conversion of (void*)0 is non-trivial.)
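As a rough illustration of the representation point, the sketch below compares the stored bytes of a null pointer against all-zero bytes. The function name `null_is_all_bits_zero` is hypothetical. On mainstream platforms it would be expected to return 1, but the standard does not guarantee that.

```c
#include <string.h>

/* Returns 1 if this implementation stores a null void pointer as
   all-bits-zero, and 0 otherwise. */
int null_is_all_bits_zero(void)
{
    void *null_ptr = 0;                     /* null pointer constant, converted by the compiler */
    unsigned char zeros[sizeof null_ptr];   /* same size as the pointer object */

    memset(zeros, 0, sizeof zeros);         /* all-bits-zero pattern */
    return memcmp(&null_ptr, zeros, sizeof null_ptr) == 0;
}
```

The check looks only at the object representation of `void *`; other pointer types could, in principle, differ.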

Section 5 of the comp.lang.c FAQ discusses null pointers.

Keith Thompson
  • 3
    Actually that makes me wonder, is it ever not UB to assign an arbitrary number to a pointer and dereference? – Mgetz Jan 14 '14 at 02:11
  • Answers in the comp.lang.c FAQ look a bit too cumbersome: yes, *formally* they do not assign 0 to a pointer, but their spirit is filling space with zeros and, as you've noted, that's not always a null pointer representation. – gfv Jan 14 '14 at 02:14
  • I think in most of the cases, it invokes undefined behavior as lower section of memory is reserved for the addresses of operating system's subroutines(interrupt service routines). – haccks Jan 14 '14 at 02:16
  • @Mgetz to quote N1570 6.3.2.3, "An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation." – gfv Jan 14 '14 at 02:23
  • 2
    @gfv implementation defined is de-facto UB except without the possibility of demons through the nose – Mgetz Jan 14 '14 at 02:24
  • It might be noted that compilers intended for use in address spaces where 0 is a valid address, notably for building operating system kernels, typically support using 0 as an address. This is not always well documented. The actual address 0 and the null pointer might not be distinguished by the compiler, which leaves it up to the software authors to ensure they never need to distinguish them (never test for a null pointer when the pointer might actually be used to point to the 0 address). – Eric Postpischil Jan 14 '14 at 02:45
  • If you dereference a null pointer, you're going to have a bad time! – Mike Warren Jan 14 '14 at 08:55
  • @EricPostpischil You are absolutely correct. This frequently is the case in embedded systems. Cross-compiling is usually a bitch as not all compilers have a pragma or commandline switch to disable the compiler throwing a fit when you explicitly need to de-ref a pointer which the compiler could identify as containing 0x0. – Tonny Jan 14 '14 at 09:08
  • 2
    @MikeWarren: Not necessarily. The behavior is undefined, which specifically means that, as far as the C standard is concerned, anything can happen; a "bad time" is not the only possibility. Some systems have had a readable value 0 at address 0 (which caused loads of fun porting programs written for such systems to stricter systems that trapped on dereferencing null pointers). – Keith Thompson Jan 14 '14 at 14:47
  • @Mgetz: "*implementation defined is de-facto UB except without the possibility of demons through the nose*" -- Implementation-defined behavior means that the implementation is required to document the behavior. In many cases, there's a specified set of behaviors from which each implementation can choose. For undefined behavior, all bets are off. – Keith Thompson Jan 23 '18 at 03:44
  • @KeithThompson yes and no, Implementation defined is one step above UB true. But in terms of relying on it I consider it the same as UB because the implementation is free to change it at any time without warning. It just means they need to document what they changed it to. So are you right? Yes technically. But I would say from a programmer's perspective it's best to treat them as very similar (not identical though). TL;DR; you can only make assumptions about it for one toolset hardware combo any change to either and all bets are off. – Mgetz Jan 23 '18 at 14:06
  • @Mgetz: "*... the implementation is free to change it at any time without warning. It just means they need to document what they changed it to.*" -- That's exactly the opposite of "without warning". And in many cases, the implementation must choose from a limited set of behaviors; there's no need to worry about your program blowing up. – Keith Thompson Jan 23 '18 at 17:14
19

How do people actually use 0x0 when it's needed?

By either:

  • writing the required code in assembly language, or
  • writing the code in C and verifying that their compiler generates correct assembly language for the desired operation
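The second option might look roughly like the sketch below. The helper `read_u32_at` is a hypothetical name, not something from this answer. It converts a numeric address to a pointer (an implementation-defined conversion) and reads through a `volatile` pointer so the access itself is not optimized away. Passing 0 is only meaningful where address 0 is actually mapped, and the generated assembly still needs checking. With GCC, for example, the `-fno-delete-null-pointer-checks` option tells the optimizer not to assume that a dereferenced pointer is non-null.

```c
#include <stdint.h>

/* Read a 32-bit value from a numeric address.
   Assumption: the target maps `addr` and allows an aligned 32-bit read there. */
uint32_t read_u32_at(uintptr_t addr)
{
    /* Integer-to-pointer conversion: the result is implementation-defined. */
    volatile const uint32_t *p = (volatile const uint32_t *)addr;

    return *p;  /* volatile forces an actual load to be emitted */
}
```

Taking the address as a run-time `uintptr_t` rather than writing a literal `0` in a pointer context keeps the null pointer constant out of the source, though the compiler may still see through it after inlining.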
Greg Hewgill
  • When a pointer is made null or has the address 0x0, does it physically point to address 0x0, i.e. when we consider OSes that have virtual memory? – Koushik Shetty Jan 14 '14 at 09:42
  • @Koushik: No, virtual memory means that address 0x0 in a particular process address space does not necessarily point to physical address 0x0. – Greg Hewgill Jan 14 '14 at 09:43
  • If you're working on embedded systems that have only physical memory then yes, it does point at address 0x0. In the example Keith Thompson cited, if the memory physically didn't exist the MC68xxx series of CPUs would throw a bus error (exception) – Dan Haynes Jan 15 '14 at 00:23
  • Oops - timed out on comment editing: The memory at 0x0 in an MC68xxx system had to exist because that is where the reset vector lives. On power up, the CPU would fetch the 32 bit value from 0x0000000..0x000003 and load it into the stack pointer, then fetch 32 bits from 0x0000004..0x000007 and use that value as the initial instruction pointer... and then off to the races it would go. – Dan Haynes Jan 15 '14 at 00:31
9

The statement:

char * x = 0;

does not necessarily put 0x0 into x. It puts the defined null pointer value for the current architecture and compiler into x.

Now, in practical terms, all compilers and processors in common use do end up putting 32 (or 64) zero bits in a row in a register or storage location in response to that statement, so if memory address 0 is useful then, as others have indicated, you are stuck using formally undefined behavior. However, once upon a time there was hardware for which a 'null pointer' was some bit pattern that was not all zeros, and, who knows, there may be again.

bmargulies
  • An implementation of (Logitech, I think) Modula-2 I used years ago implemented the NIL pointer as FFFF:FFFF (segmented 16-bit land). Of course, it wasn't C and the rules are different anyway (ie. you can't just do `if (p) ...`). – Greg Hewgill Jan 14 '14 at 02:33
  • @Greg Fun with undefined behavior! `FFFF:FFFF` is even more situational than `0000:0000`, since it might be interpreted as linear address `10FFEF` or `00FFEF` depending upon whether address bit 20 is enabled, and stomping on what's at either of those locations could result in two different kinds of trouble. – Jeffrey Hantin Jan 14 '14 at 05:00
  • @JeffreyHantin: Not only that, reading (or writing) more than a byte at that address results in all kinds of weirdness. – Greg Hewgill Jan 14 '14 at 05:04
  • 2
    @GregHewgill Of course `if (p)` will work then, because it doesn't test for the 0 pattern, but indeed for the presence (resp. absence) of the `NULL` pointer pattern. – glglgl Jan 14 '14 at 12:30
  • 1
    @glglgl: Yes of course, but what I meant was that `if (p)` (with an *implicit* comparison against `NULL` or `nullptr`) is not valid Modula-2 syntax, and the equivalent would have to be `IF p # NIL` where the comparison is *explicit*. – Greg Hewgill Jan 14 '14 at 18:06
2

Annex J It is undefined behavior when...

The operand of the unary * operator has an invalid value (6.5.3.2).

In that same footnote you mentioned, it says a null pointer is an invalid value. Therefore, it is not prohibited, but undefined behavior. As for the distinction between address 0x0 and a null pointer, see Is memory address 0x0 usable?.

The null pointer is not necessarily address 0x0, so potentially an architecture could choose another address to represent the null pointer and you could get 0x0 from new as a valid address.

Whether the null pointer is reserved by the operating system or the C++ implementation is unspecified, but plain new will never return a null pointer, whatever its address is (nothrow new is a different beast). So, to answer your question:

Is memory address 0x0 usable?

Maybe, it depends on the particular implementation/architecture.

In other words, feel free to use 0x0 if you're sure that it won't cause a crash on your system.

Community
  • _Formally_, undefined behaviour can include working with 0x0 like it is a normal memory, but relying on undefined behaviours can be painful in the future. – gfv Jan 14 '14 at 02:25
  • @gfv The important thing is that there's a distinction. Whether or not `0x0` is safe to use has to be decided on a case-by-case basis. –  Jan 14 '14 at 02:29
2

The operating system uses a table of pointers to interrupt routines to call the appropriate interrupt handler(s). Generally (in most operating systems) this table of pointers is stored in low memory (the first few hundred or so locations). These locations hold the addresses of the interrupt service routines for the various devices.

So when you do

char *ptr = 0x0; 

then most likely you are initializing your pointer with the address of an interrupt service routine. Dereferencing (or modifying) a memory location that belongs to the operating system will most likely cause the program to crash.
So it is better not to initialize a pointer to 0x0 and dereference it until you have confirmed that the location doesn't belong to the OS.

haccks
  • 1
    What if you are actually writing the OS? You still need a way to do this sort of thing. – Greg Hewgill Jan 14 '14 at 02:56
  • @GregHewgill; True. But in general you can't dereference an address that belongs to the OS. – haccks Jan 14 '14 at 02:58
  • 2
    Isn't there a separation between kernel space and user space? – alvits Jan 14 '14 at 03:02
  • @haccks - please enlighten me. if an app is running in the userspace, won't address 0x0 be relative to the userspace base address? – alvits Jan 14 '14 at 03:03
  • @alvits; Who told you that. Suppose an application is running in user space which needs to request a system call then what will happen? – haccks Jan 14 '14 at 03:07
  • 1
    @alvits; I hope you are well aware of *Dual-Mode-Operation*, i.e, kernel mode and user mode. When you run your application program then your system is in user mode. When it requests for a system call then the transition occurs from user mode to kernel mode to fulfill the request. – haccks Jan 14 '14 at 03:13
  • @alvits; *Isn't there a separation between kernel space and user space?*; Yes. Of course. There is. – haccks Jan 14 '14 at 03:17
  • Glad to hear that it helped you. :) – haccks Jan 14 '14 at 03:22