To answer the original question: there was no need to add more than 48 bits of physical address (PA).
Servers need the maximum amount of memory, so let's try to dig deeper.
1) The largest commonly used server configuration is an 8 Socket (8S) system. An 8S system is nothing but 8 server CPUs connected by a high speed coherent interconnect (or simply, a high speed "bus") to form a single node. There are larger systems out there, but they are few and far between; we are talking about commonly used configurations here. Note that in real world usage, the 2 Socket system is one of the most commonly used servers, and 8S is typically considered very high end.
2) The main types of memory used by servers are byte addressable regular DRAM (e.g. DDR3/DDR4 memory), Memory Mapped IO (MMIO, such as memory used by an add-in card), and Configuration Space used to configure the devices that are present in the system. The first type is usually the biggest (and hence needs the largest number of address bits). Some high end servers use a large amount of MMIO as well, depending on the actual configuration of the system.
3) Assume each server CPU socket can house 16 DDR4 DIMMs, with a maximum DDR4 DIMM size of 256GB. (Depending on the version of server, the number of possible DIMMs per socket is actually less than 16, but continue reading for the sake of the example.)
So each socket can theoretically have 16*256GB=4096GB = 4 TB.
For our example 8S system, the DRAM size can be a maximum of 4*8 = 32 TB. This means that
the max number of bits needed to address this DRAM space is 45 (log2 of 32 TB, since 32 TB = 2^45 bytes).
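The arithmetic above can be checked with a short Python sketch. The DIMM count, DIMM size and socket count are just the example figures used in this answer, not the limits of any specific product, and the function name is made up for illustration.

```python
def address_bits_needed(total_bytes: int) -> int:
    """Smallest number of address bits that can index total_bytes bytes."""
    # For n >= 1, (n - 1).bit_length() equals ceil(log2(n)).
    return (total_bytes - 1).bit_length()

GIB = 1 << 30
TIB = 1 << 40

# Example figures from this answer (assumptions, not product limits).
DIMMS_PER_SOCKET = 16
DIMM_SIZE_BYTES = 256 * GIB
SOCKETS = 8

per_socket_bytes = DIMMS_PER_SOCKET * DIMM_SIZE_BYTES
total_dram_bytes = per_socket_bytes * SOCKETS
dram_bits = address_bits_needed(total_dram_bytes)

print(per_socket_bytes // TIB, "TB per socket")
print(total_dram_bytes // TIB, "TB of DRAM in the 8S system")
print(dram_bits, "address bits needed")
```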
We won't go into the details of the other types of memory (MMIO, MMCFG etc), but the point here is that the most "demanding" type of memory for an 8 Socket system with the largest DDR4 DIMMs available today (256 GB DIMMs) needs only 45 bits.
For an OS that supports 48 bits (Windows Server 2016, WS16, for example), there are (48-45=) 3 remaining bits.
This means that if we used the lower 45 bits solely for 32TB of DRAM, the 3 remaining bits give us 2^3 = 8 times that much addressable space, for a total of 256 TB. Everything above the DRAM can be used for MMIO/MMCFG.
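As a rough illustration of the headroom, here is a minimal Python sketch. It assumes, as the example does, that DRAM occupies exactly the low 45 bits of the address space; the function name and the return layout are hypothetical.

```python
TIB = 1 << 40

def address_space_headroom(pa_bits: int, dram_bits: int):
    """Return (total TB addressable, TB left over above DRAM)."""
    total_bytes = 1 << pa_bits
    dram_bytes = 1 << dram_bits
    return total_bytes // TIB, (total_bytes - dram_bytes) // TIB

total_tb, remaining_tb = address_space_headroom(pa_bits=48, dram_bits=45)
scale = 1 << (48 - 45)  # how many times larger the 48-bit space is

print(total_tb, "TB total,", remaining_tb, "TB left for MMIO/MMCFG")
print("48-bit space is", scale, "times the 45-bit DRAM range")
```

With these inputs the total comes out to 256 TB, of which 32 TB is DRAM and the remainder is available for everything else in the address map.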
So, to summarize:
1) 48 bits of physical address is plenty to support the largest systems of today that are "fully loaded" with copious amounts of DDR4 and also plenty of other IO devices that demand MMIO space: 256TB, to be exact.
Note that this 256TB address space (=48 bits of physical address) does NOT include disk drives such as SATA drives, because they are NOT part of the address map. It only includes the memory that is byte-addressable and exposed to the OS.
2) CPU hardware may choose to implement 46, 48 or > 48 bits depending on the generation of the server. But another important factor is how many bits the OS recognizes.
Today, WS16 supports 48 bit Physical addresses (=256 TB).
What this means to the user is that even if you have a large, ultra modern server CPU that can support >48 bits of addressing, if you run an OS that only supports 48 bits of PA, then you can only take advantage of 256 TB.
3) All in all, there are two main factors that determine whether you can take advantage of a higher number of address bits (= more memory capacity).
a) How many bits does your CPU HW support? (This can be determined with the CPUID instruction on Intel CPUs.)
b) What OS version are you running, and how many bits of PA does it recognize/support?
The min of (a,b) will ultimately determine the amount of addressable space your system can take advantage of.
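To make the min(a, b) rule concrete, here is a small Python sketch. On x86, CPUID leaf 0x80000008 reports the physical address width in EAX bits 7:0 and the linear (virtual) address width in bits 15:8. Python cannot execute CPUID directly, so the sketch decodes a raw EAX value that you would obtain some other way; the sample value and the OS limit of 48 bits are assumptions for illustration, and the function names are hypothetical.

```python
def decode_address_sizes(eax: int):
    """Decode EAX from CPUID leaf 0x80000008 into (physical, linear) bits."""
    physical_bits = eax & 0xFF        # EAX[7:0]
    linear_bits = (eax >> 8) & 0xFF   # EAX[15:8]
    return physical_bits, linear_bits

def usable_pa_bits(cpu_pa_bits: int, os_pa_bits: int) -> int:
    """The smaller of the CPU and OS limits is what the system can use."""
    return min(cpu_pa_bits, os_pa_bits)

def addressable_tb(pa_bits: int) -> int:
    """Size of the address space in TB (2^40 bytes) for pa_bits >= 40."""
    return (1 << pa_bits) // (1 << 40)

SAMPLE_EAX = 0x3034   # assumed sample: 52 physical bits, 48 linear bits
OS_PA_BITS = 48       # assumed OS limit, as in the WS16 example

cpu_bits, linear_bits = decode_address_sizes(SAMPLE_EAX)
usable = usable_pa_bits(cpu_bits, OS_PA_BITS)

print("CPU supports", cpu_bits, "bits, OS supports", OS_PA_BITS, "bits")
print("Usable:", usable, "bits =", addressable_tb(usable), "TB")
```

In this sample the CPU could address more than the OS allows, so the OS limit is the one that decides the usable space.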
I have written this response without looking into the other responses in detail. Also, I have not delved in detail into the nuances of MMIO, MMCFG and the entirety of the address map construction. But I do hope this helps.
Thanks,
Anand K Enamandram,
Server Platform Architect
Intel Corporation