Can a 32-bit processor really address 2^32 memory locations?

Question

I feel this might be a weird/stupid question, but here goes...

In the question Is NULL in C required/defined to be zero?, it has been established that the NULL pointer points to an unaddressable memory location, and also that NULL is 0.

Now, supposedly a 32-bit processor can address 2^32 memory locations.

2^32 is only the number of distinct numbers that can be represented using 32 bits. Among those numbers is 0. But since 0, that is, NULL, is supposed to point to nothing, shouldn't we say that a 32-bit processor can only address 2^32 - 1 memory locations (because the 0 is not supposed to be a valid address)?

Keep in mind the difference between real addresses and virtual addresses. The virtual memory of a process does not directly correspond to the addressable physical memory on the system. — Polynomial, Nov 28 '11 at 00:42
@Polynomial: The C standard knows nothing of virtual memory or processes. — Billy ONeal, Nov 28 '11 at 00:44
Is the question about C or about hardware? The tag says "C" but the title says "processor". Pick one. — Kerrek SB, Nov 28 '11 at 00:45
This is just a question of semantics; until you lay down some definitions, there's no way to answer. Whether or not null is a valid address depends completely on context. It's valid in the sense it exists and pointers can hold that value, but invalid in the sense no objects exist there. Neither of those is "correct", they're just different. — GManNickG, Nov 28 '11 at 00:45
You are asking if a processor can address but arguing, that C program cannot. — Roman Byshko, Nov 28 '11 at 00:46
The `0` address is special in C and at the OS level, but at the CPU level itself, the `0` address has no special meaning and hardware can directly address it. Note for example, x86 real mode, which can directly address `0`. — wkl, Nov 28 '11 at 00:47
Just to be clear, it is *perfectly* possible for a 32-bit processor to access more than 2^32 memory locations; that's exactly what [PAE](http://en.wikipedia.org/wiki/Physical_Address_Extension) is for. — Daniel Pryden, Nov 28 '11 at 16:39

score 9 · Answer 1 · answered Nov 28 '11 at 00:46

If a 32-bit processor can address 2^32 memory locations, that simply means that a C pointer on that architecture can refer to 2^32 - 1 locations plus NULL.
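The counting argument can be sketched in a few lines of C. This is only an illustration: the function names are made up for this example, and it assumes that exactly one bit pattern is reserved for the null pointer, which the C standard does not require.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct bit patterns a pointer of the given width can hold.
   The width must be below 64 so the result fits in a uint64_t. */
uint64_t distinct_patterns(unsigned width_bits)
{
    assert(width_bits < 64);
    return (uint64_t)1 << width_bits;
}

/* Patterns left over for real locations if exactly one pattern is
   reserved for the null pointer (an assumption made for illustration;
   C does not say which pattern, or how many, represent null). */
uint64_t non_null_patterns(unsigned width_bits)
{
    return distinct_patterns(width_bits) - 1;
}
```

For a 32-bit pointer, `distinct_patterns(32)` gives 4294967296 and `non_null_patterns(32)` gives 4294967295, which is the "2^32 - 1 locations plus NULL" split described above.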

Gabe

+1: The most straightforward answer. All this stuff about virtual memory is irrelevant. – Oliver Charlesworth Nov 28 '11 at 01:00
+1 for making my point in 1/100th the space in which I made it. – Billy ONeal Nov 28 '11 at 07:00

score 8 · Accepted Answer · answered Nov 28 '11 at 00:49

the NULL pointer points to an unaddressable memory location

This is not true. From the accepted answer in the question you linked:

Notice that, because of how the rules for null pointers are formulated, the value you use to assign/compare null pointers is guaranteed to be zero, but the bit pattern actually stored inside the pointer can be any other thing

Most platforms of which I am aware do in fact handle this by marking the first few pages of address space as invalid. That doesn't mean the processor can't address such things; it's just a convenient way of making low values invalid as pointers. For instance, several Windows APIs use this to distinguish between a resource ID and a pointer to actual data; everything below a certain value (65k if I recall correctly) is not a valid pointer, but is a valid resource ID.
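The resource-ID trick can be sketched roughly as follows. This is not the Windows API itself: `ID_LIMIT`, `make_id`, `is_id`, and `id_of` are hypothetical names, and the 64 KiB threshold is an assumption taken from the "65k" figure above. It also relies on integer-to-pointer conversions, which are implementation-defined in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: values below 64 KiB are treated as IDs,
   on the assumption that the platform never maps those addresses. */
#define ID_LIMIT ((uintptr_t)0x10000)

/* Smuggle a small integer ID through a pointer-typed parameter.
   The integer-to-pointer conversion is implementation-defined. */
const char *make_id(uint16_t id)
{
    return (const char *)(uintptr_t)id;
}

/* True if the argument carries an ID rather than a real pointer. */
bool is_id(const char *name_or_id)
{
    return (uintptr_t)name_or_id < ID_LIMIT;
}

/* Recover the ID; only meaningful when is_id() is true. */
uint16_t id_of(const char *name_or_id)
{
    assert(is_id(name_or_id));
    return (uint16_t)(uintptr_t)name_or_id;
}
```

A function taking a `const char *` can then accept either `make_id(42)` or an ordinary string such as `"ICON"` and tell them apart with `is_id`, as long as the platform really does keep the low addresses unmapped.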

Finally, just because C says something doesn't mean that the CPU needs to be restricted that way. Sure, C says dereferencing a null pointer is undefined -- but there's no reason someone writing in assembly need be subject to such limitations. Real machines typically can do much more than the C standard says they have to. Virtual memory, SIMD instructions, and hardware IO are some simple examples.

Thanks! Nice note about the resource IDs vs. pointers, I did not know that. — houbysoft, Nov 28 '11 at 01:35

Basile Starynkevitch · Answer 3 · 2011-11-28T00:53:27.307

0

It depends upon the operating system. It is related to virtual memory and address spaces.

In practice (at least on 32-bit x86 Linux), addresses are byte numbers, but most refer to 4-byte words and so are often multiples of 4.

More importantly, as seen from a Linux application, at most 3 GB of the 4 GB address space is visible. A whole gigabyte of address space (including the first and last pages, near the null pointer) is unmapped. In practice, the process sees much less than that. See its /proc/self/maps pseudo-file (e.g. run cat /proc/self/maps to see the address map of the cat command on Linux).
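A program can also inspect its own map instead of running `cat`. The following is a minimal sketch, assuming a Linux system where /proc/self/maps exists; `dump_address_map` is a name invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Copies the calling process's address map to `out`.
   Returns the number of lines copied, or -1 if /proc/self/maps
   cannot be opened (for example, on a non-Linux system). */
int dump_address_map(FILE *out)
{
    assert(out != NULL);

    FILE *maps = fopen("/proc/self/maps", "r");
    if (maps == NULL)
        return -1;

    int lines = 0;
    int c;
    while ((c = fgetc(maps)) != EOF) {
        fputc(c, out);
        if (c == '\n')
            lines++;   /* one line per mapped region */
    }
    fclose(maps);
    return lines;
}
```

Calling `dump_address_map(stdout)` prints one line per mapped region. Each line begins with the start and end address of the region, so the gaps between lines are the unmapped parts of the address space.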

edited Nov 28 '11 at 00:53

answered Nov 28 '11 at 00:45

No, it's mapped. It's just that part of it is reserved for kernel space. – Billy ONeal Nov 28 '11 at 00:47
Try `cat /proc/self/maps`; most of the address space **as seen by the application** (here the `cat` command) is not mapped. – Basile Starynkevitch Nov 28 '11 at 00:49
Okay, I suppose. But this really doesn't depend on the OS -- it depends more on your C implementation than on the OS. Some systems have standard C implementations across the system. Most systems don't. – Billy ONeal Nov 28 '11 at 00:52
No, address spaces & virtual memory are provided by the OS kernel. A program written in assembly or a language unrelated to C does also have them. – Basile Starynkevitch Nov 28 '11 at 00:55
But such non C applications need not treat the zero address as "special" – Billy ONeal Nov 28 '11 at 00:57
Yes, they do, because the kernel doesn't map page 0. – Basile Starynkevitch Nov 28 '11 at 00:58
Maybe on Linux. Certainly not on all platform/OS combinations. – Billy ONeal Nov 28 '11 at 00:59
Yes. 30 years ago, VAX/VMS mapped the null pointer. And a C program could dereference the NULL pointer -it got zero-. – Basile Starynkevitch Nov 28 '11 at 01:00
Or the PIC I received yesterday. – Billy ONeal Nov 28 '11 at 01:01

score 0 · Answer 4 · answered Nov 28 '11 at 00:53

First, let's note the difference between the linear address (AKA the value of the pointer) and the physical address. While the linear address space is, indeed, 32 bits (AKA 2^32 different bytes), the physical address that goes to the memory chip is not the same. Parts ("pages") of the linear address space might be mapped to physical memory, or to a page file, or to an arbitrary file, or marked as inaccessible and not backed by anything. The zeroth page happens to fall into the last category. The mapping mechanism is implemented on the CPU level and maintained by the OS.
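To make the page mapping more concrete, here is a sketch of how a 32-bit linear address is split into lookup indices. It assumes the classic 32-bit x86 layout with two-level paging and 4 KiB pages (no PAE); the struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of a linear address under the assumed layout:
   classic 32-bit x86 paging with 4 KiB pages. */
struct linear_parts {
    uint32_t dir_index;   /* bits 31..22: entry in the page directory */
    uint32_t table_index; /* bits 21..12: entry in the page table */
    uint32_t offset;      /* bits 11..0:  byte within the 4 KiB page */
};

struct linear_parts split_linear(uint32_t linear)
{
    struct linear_parts p;
    p.dir_index   = linear >> 22;
    p.table_index = (linear >> 12) & 0x3FF;
    p.offset      = linear & 0xFFF;
    return p;
}

/* Inverse of split_linear; each field must be within its bit width. */
uint32_t join_linear(struct linear_parts p)
{
    assert(p.dir_index < 1024 && p.table_index < 1024 && p.offset < 4096);
    return (p.dir_index << 22) | (p.table_index << 12) | p.offset;
}
```

Address 0 splits into directory index 0, table index 0, offset 0, so it is simply the first page of the first table. Nothing in the hardware makes it special; the OS just leaves that page inaccessible.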

That said, the zero address being unaddressable memory is just a C convention that's enforced by every protected-mode OS since the first Unices. In MS-DOS-era real-mode operating systems, the null far pointer (0000:0000) was perfectly addressable; however, writing there would ruin system data structures and bring nothing but trouble. The null near pointer (DS:0000) was also perfectly accessible, but the run-time library would typically reserve some space around zero to protect against accidental null pointer dereferencing. Also, in real mode (as in DOS) the address space was not a flat 32-bit one; it was effectively 20-bit.
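The 20-bit figure follows from how real mode forms an address: the 16-bit segment is shifted left by four bits and added to the 16-bit offset. A small sketch, assuming 8086-style behaviour where the result wraps at 1 MiB; the function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Physical address for a real-mode segment:offset pair.
   Assumes 8086-style wraparound to 20 bits; later CPUs with the A20
   line enabled can reach slightly above 1 MiB instead of wrapping. */
uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    uint32_t physical = (((uint32_t)segment << 4) + offset) & 0xFFFFF;
    assert(physical <= 0xFFFFF);   /* always fits in 20 bits */
    return physical;
}
```

The null far pointer 0000:0000 maps to physical address 0, and many different segment:offset pairs name the same byte: 0040:0000 and 0000:0400 both map to 0x400.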

C knows nothing of linear addresses versus physical addresses. (And for the record I've never heard that called a linear address; virtual address may work) A PIC (for instance) certainly doesn't have such things. Things like "protected mode" are x86 specific; the OP asked about all 32 bit architectures. — Billy ONeal, Nov 28 '11 at 00:55
