20

I'm reading "Understanding the Linux Kernel".

Paging for 64-bit Architectures

As we have seen in the previous sections, two-level paging is commonly used by 32-bit microprocessors. Two-level paging, however, is not suitable for computers that adopt a 64-bit architecture. Let's use a thought experiment to explain why:

Start by assuming a standard page size of 4 KB. Because 1 KB covers a range of 2^10 addresses, 4 KB covers 2^12 addresses, so the Offset field is 12 bits. This leaves up to 52 bits of the linear address to be distributed between the Table and the Directory fields. If we now decide to use only 48 of the 64 bits for addressing (this restriction leaves us with a comfortable 256 TB address space!), the remaining 48-12 = 36 bits will have to be split among the Table and the Directory fields. If we now decide to reserve 18 bits for each of these two fields, both the Page Directory and the Page Tables of each process should include 2^18 entries, that is, more than 256,000 entries.
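
The arithmetic in the quoted passage can be checked with a short Python sketch. The function name `two_level_split` and its parameters are made up for illustration, and the 8-byte entry size is an assumption (the excerpt does not state one); the other numbers come from the book excerpt.

```python
def two_level_split(address_bits=48, page_size=4096, entry_size=8):
    """Split a linear address evenly between Directory and Table fields."""
    offset_bits = page_size.bit_length() - 1   # 4096 = 2**12 -> 12 offset bits
    remaining = address_bits - offset_bits     # 48 - 12 = 36 bits left over
    field_bits = remaining // 2                # 18 bits for each field
    entries = 1 << field_bits                  # 2**18 entries per table
    return {
        "offset_bits": offset_bits,
        "field_bits": field_bits,
        "entries_per_table": entries,
        "table_bytes": entries * entry_size,   # assumes 8-byte entries
        "address_space_tib": (1 << address_bits) // (1 << 40),
    }

info = two_level_split()
print(info)
```

With these assumptions every single table would occupy 2 MiB, which is the cost the book is pointing at.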

  1. "If we now decide to use only 48 of the 64 bits for addressing". Why? & Why only 48 bits? Why not some other number?

  2. Well, I'm just a regular PC user & programmer. It's hard for me to believe that 32-bit addressing, i.e. a 4GB (2GB/3GB to be more correct) address space per process, is a limit. If you have really encountered this limit, please give me an example.

  3. What is this limit for windows?

  4. I know that virtual memory != physical memory & that processor address pins have nothing to do with virtual memory. This is a completely different question: how do I find out the number of address pins (= size of the address bus) of a processor? The processor specifications at http://ark.intel.com don't include this.

Answer:

See Paul Betts's answer for a reasonable answer to the 1st question.

claws
  • Also see http://stackoverflow.com/questions/3196684/memory-addressing & http://stackoverflow.com/questions/132930/32-vs-64-bits-whats-the-big-deal – claws Jul 10 '10 at 14:37

7 Answers

10

None of these answers is right. The reason OSs don't use the full 64 bits is that the page tables would be far larger (x86-64 already needs 4 levels of page tables for 48-bit addresses), and there's no reason to pay for the extra indirection when 48 bits is enough. 48 bits is also convenient because you get some spare bits to store flags in (pointer tagging).

Ana Betts
  • Your answer is also wrong. The amd64 architecture explicitly requires all unused upper bits to match bit 47 (canonical form) to make sure nobody is storing flags in the address, and the page tables could just be kept small by increasing the page size. 1GB page sizes are pretty nice because they turn off virtual paging and make programs really fast (you can measure this on HP-UX). – Lothar Jul 14 '10 at 13:12
  • 1
    @Lothar You obviously can't *use* the pointer while it has flags, but in 32-bit, you don't have any "known" bits you can mask away before using the pointer. – Ana Betts Jul 14 '10 at 17:08
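
The exchange above can be made concrete with a small sketch. It models 64-bit addresses as Python integers and assumes the 48-bit x86-64 canonical form, in which bits 63..48 must be copies of bit 47. The helper names `is_canonical`, `tag_pointer`, and `untag_pointer` are made up for illustration; this is not how any particular runtime does it.

```python
VA_BITS = 48  # assumed implemented virtual-address width

def is_canonical(addr, va_bits=VA_BITS):
    """True if bits 63..va_bits are all copies of bit va_bits-1."""
    upper = addr >> (va_bits - 1)              # bit 47 and everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits for va_bits=48
    return upper == 0 or upper == all_ones

def tag_pointer(addr, tag, va_bits=VA_BITS):
    """Stash a small tag in the unused upper bits of a lower-half address."""
    assert is_canonical(addr) and addr >> (va_bits - 1) == 0
    assert 0 <= tag < (1 << (64 - va_bits))
    return (tag << va_bits) | addr

def untag_pointer(tagged, va_bits=VA_BITS):
    """Return (address, tag); the tag must be masked off before use."""
    return tagged & ((1 << va_bits) - 1), tagged >> va_bits

ptr = 0x00007F1234567000          # hypothetical user-space address
tagged = tag_pointer(ptr, 0x2A)
print(is_canonical(ptr), is_canonical(tagged), untag_pointer(tagged) == (ptr, 0x2A))
```

A tagged value is non-canonical, so the hardware would fault on it if dereferenced directly; masking the tag away first is what makes the trick workable.
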
6

"If we now decide to use only 48 of the 64 bits for addressing". Why? & Why only 48bits? Why not some other number?

System architects make tradeoffs. 256TB seems like more than enough room for 1 process's address space. Remember virtual address != physical address, and generally speaking, each process has its own address space.

As long as pointers are 64 bits, this is more of a performance capability issue than anything else. If & when 48 bits becomes a limitation, the OS could be tweaked to use more bits of the 64-bit address space without breaking application compatibility. For now, the architects are just buying themselves a very comfortable amount of time.

It may have to do with processor-side virtual addressing capabilities, as many processors now have memory management units to handle the virtual -> physical memory mapping.

How to know the number of address pins (= size of address bus) for a processor. http://ark.intel.com specifications of a processor doesn't include this spec.

This is for the most part irrelevant. It's a way for a processor to implement various physical addressing schemes. A 64-bit processor could achieve external address/data buses for its complete address space with 64, 32, 16, 8, 4, 2, or 1 address pin if the bus is synchronous and the address bits get multiplexed in time. Again, virtual address != physical address; 64-bit virtual addressing could be implemented with 48-bit or 32-bit physical addresses (just that you would be limited to 2^48 or 2^32 words of memory).

update: if you really want to know, you have to look at the datasheet of each processor in question. E.g. Intel Core 2 Duo -- section 4.2 of the datasheet talks about the signals -- the address bus is 36-bits wide (but is really 33 signal lines; the data width is 64-bit = 8 bytes so the other 3 lines are probably unnecessary with proper data alignment)

Well, I'm just a regular PC user & programmer. It's hard for me to believe that 32-bit addressing, i.e. a 4GB (2GB/3GB to be more correct) address space per process, is a limit. If you have really encountered this limit, please give me an example.

Two words: memory-mapped files.
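
A memory-mapped file occupies virtual address space equal to the mapped length, whether or not the data is resident in RAM, which is why mapping large files runs into a 2GB-4GB address space quickly. A minimal sketch using Python's standard `mmap` module; the temporary file and its 1 MiB size are placeholders chosen for illustration:

```python
import mmap
import tempfile

SIZE = 1 << 20  # 1 MiB placeholder; a real workload might map many GB

with tempfile.TemporaryFile() as f:
    f.truncate(SIZE)                       # extend the empty file to SIZE bytes
    with mmap.mmap(f.fileno(), SIZE) as m:
        m[0:5] = b"hello"                  # file contents are accessed like memory
        mapped_len = len(m)                # address space taken by this mapping
        first = bytes(m[0:5])

print(mapped_len, first)
```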

Jason S
  • `Remember virtual address != physical address`. Yeah, now I recite it in my dreams too. :P. Yeah, 256TB seems large, but it only seems large for now. Once, 1MB was thought to be more than enough, & 32-bit IP addresses were exhausted in no time. When this 256TB becomes a limit, people will say, "now it's time to move on to 64-bit". Damn! I hate it. – claws Jul 10 '10 at 14:48
  • :-) From a programmer's standpoint, as long as the pointers are 64-bits it doesn't matter. – Jason S Jul 10 '10 at 14:50
  • Jason: From a recompilation standpoint, sure. But it's only *really* true as long as the programmer didn't make any assumptions that are true at 48 but not at 64. :-) It sounds crazy that anyone would write code that works at 48 but not 64, but then, I've thought that same thing for every word-length and endianness change ever. – Ken Jul 10 '10 at 15:01
  • @claws, that'll be way off - if ever - on home PCs. 4GB is 4096x 1MB, but 256TB is _65,536x_ 4GB. I doubt any software will require / use close to that much RAM in 20 years -- short of iTunes, anyway. – Ben M Jul 10 '10 at 15:13
  • @Jason S: Please see 4th question. – claws Jul 10 '10 at 15:20
  • @Ken: why would a programmer make any assumptions about pointer values, other than checking against NULL and the width of the pointer itself? they're supposed to be opaque references to blocks of memory. – Jason S Jul 11 '10 at 02:00
  • 1
    Jason: That is a *great* question! I've asked exactly the same thing for the past 20+ years when porting C code between 8-, 16-, 32-, and 64-bit processors, little- and big-endian. I don't know why but a whole lot of them sure do! – Ken Jul 11 '10 at 02:25
  • Part of the 64 bits used in addressing memory on a 64-bit system is for CPU features such as security killbits. Much like file permissions, the OS can tell the processor that some memory may be executed and other memory may never be by toggling the appropriate killbits, and the CPU will generate traps rather than execute data pages. – user268396 Jul 11 '10 at 06:31
  • @Jason S: Thanks. That was the exact answer I was looking for. But I didn't get this part: `(but is really 33 signal lines; the data width is 64-bit = 8 bytes so the other 3 lines are probably unnecessary with proper data alignment)` 1. What data alignment? 2. How did you deduce 'other 3 lines are probably unnecessary' from '64-bit data width lines'? 3. One more clarification: 33 address lines => max 8GB of RAM, right? Yesterday, while looking through similar questions, I encountered some strange math: `2^33 * 8 bytes = 64 GB` in http://stackoverflow.com/questions/3196684/memory-addressing – claws Jul 11 '10 at 06:44
  • When in doubt, look at the datasheet. Its address lines are A[35:3]; A[2:0] are missing. That means physical addresses in bytes are multiples of 8. It's a 64-bit = 8-byte data bus. QED. (And yes, 2^36 = 64G.) – Jason S Jul 11 '10 at 15:53
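
The "strange math" in the comments works out as follows: each combination of the address lines selects one bus-width chunk, not one byte. A small sketch, using the 33 signal lines and the 8-byte data bus quoted in the answer; the function name `addressable_bytes` is made up for illustration:

```python
def addressable_bytes(address_lines, data_bus_bytes):
    """Each address-line pattern selects one chunk as wide as the data bus."""
    return (1 << address_lines) * data_bus_bytes

lines = 33                                 # signal lines quoted in the answer
total = addressable_bytes(lines, 8)        # 64-bit data bus = 8 bytes per chunk
byte_only = addressable_bytes(lines, 1)    # what 33 lines would give if byte-addressed

print(total // (1 << 30), "GiB with an 8-byte bus")
print(byte_only // (1 << 30), "GiB if each pattern addressed a single byte")
```

So 33 lines give 2^33 * 8 bytes = 2^36 bytes = 64GB, while the 8GB figure would only hold if each line pattern selected a single byte.
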
5

No current x86-64 design uses more than 48 bits for this -- so it's a convenient number to pick, and it's automatically the same limit on Windows, too.

Ken
2

It's hard for me to believe that 32-bit addressing, i.e. a 4GB (2GB/3GB to be more correct) address space per process, is a limit. If you have really encountered this limit, please give me an example.

It's more efficient (quicker) to get data from RAM than to get it from disk.

The speed of SQL server depends partly on how much data (e.g. how many of its index and data pages) it's able to keep in RAM instead of on disk.

So, SQL databases (for example) may be faster on machines with more than 4GB of RAM.

The same is true for other types of server (e.g. file servers, HTTP proxies, etc.), which can be faster if they can have larger RAM caches.

ChrisW
  • If you need more RAM, Intel has offered PAE for a long time, which can support 64GB of RAM. That means there is less need to swap pages to disk & a full 4GB worth of pages can live in main memory. Moreover, in the current market I think 8GB is the maximum RAM a board can support. How did moving to 64-bit solve the problem? – claws Jul 10 '10 at 15:03
  • @claws - The Dell PowerEdge R910 for example supports up to 1 TB of RAM. – ChrisW Jul 10 '10 at 15:20
  • Thanks. I didn't know beasts like this even exist. http://www.dell.com/us/en/enterprise/servers/rack_optimized/cp.aspx?refid=rack_optimized&s=biz&cs=555 – claws Jul 10 '10 at 16:07
  • 1
    Well, even regular $100 boards support 16-24GB of RAM these days. – Thomas Kjørnes Jul 10 '10 at 20:08
  • 1
    @claws - i7 desktops typically support somewhere in the vicinity of 24GB of DDR3. The 8GB limit (mostly) went away with the core 2 line. – Donnie Jul 10 '10 at 20:09
2

I think the simplest answer is Moore's law.

Moore's law basically says that ICs halve in cost every 18 months. There are some ways of interpreting this: The amount of memory installed in a PC tends to double every 18 months. The effective speed doubles (at least if you take the cores * the MHz rather than just the MHz).

Anyway, we've only just run out of 32-bit address space, so a jump from 32 to 48 bits means that, on the hardware side, we've allocated expansion space for about 16 iterations of Moore's law, which works out to about 20 years.
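
As a rough check of that estimate: every extra address bit doubles the addressable space, so 16 extra bits are 16 doublings. The sketch below assumes exactly one doubling per 18 months, which is only this answer's reading of Moore's law; the function name `headroom_years` is made up for illustration.

```python
def headroom_years(old_bits, new_bits, months_per_doubling=18):
    """Years of growth covered if needs double every `months_per_doubling`."""
    doublings = new_bits - old_bits        # each extra bit doubles the space
    return doublings, doublings * months_per_doubling / 12

print(headroom_years(32, 48))  # (16, 24.0)
```

At a strict 18 months per doubling this comes to 24 years, the same ballpark as the figure above.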

I'm pretty sure that while some PCs might be pushed to the 10-year mark, 20 years of expansion headroom seems a decent tradeoff: computers in 20 years' time are going to be different. They won't be using the same CPUs and RAM buses, just as they were different 20 years ago. Designing more than 20 years' worth of headroom into an interface is just silly over-engineering that is never going to see use anyway.

And it's not so short that existing hardware runs a real risk of being obsoleted too soon.

Chris Becke
1

It's hard for me to believe that 32-bit addressing, i.e. a 4GB (2GB/3GB to be more correct) address space per process, is a limit. If you have really encountered this limit, please give me an example.

It doesn't exist any more (except on some old employees' personal machines), but I worked on a suite of software called RealiMation back in the late 1990s/early 2000s. It was a real-time 3D engine for visualisation and simulation. One of our customers regularly created highly detailed models that hit the 2GB memory limit. We would load textures on the fly as and when needed, and had to add code to check for memory allocation failure so we could continue displaying the model, albeit untextured.

ChrisF
0

From a hardware perspective, another consideration is alignment.

Once you need a data type of more than 4 bytes, say 6, you need to put it on an 8-byte boundary to retrieve it in a single instruction. If you don't align, you need to do bit masking and shifting, and add checks for this in the (assembly) code.

Many people were annoyed that, after the switch to 64-bit, their programs consumed so much more memory. They would have wanted 48-bit pointers, and if the restrictions on alignment weren't there, the CPU makers probably would have made a 48-bit architecture.

Note that if you are so starved for memory that you want your pointers to be 6 bytes there are ways to do that. But there is a penalty to execution time.

Erik van Velzen