I'm a little confused about Intel Optane DC. I want my Optane DC to be able to act as both DRAM and storage. On the one hand, I understood that only the "Intel Optane DC Persistent Memory DIMM" can act as DRAM, because it has 2 modes (Memory Mode and App Direct Mode). On the other hand, in this link: https://www.intel.com/content/www/us/en/products/docs/memory-storage/solid-state-drives/optane-ssd-dc-p4800x-mdt-brief.html I read that "Together, DRAM and Intel® Optane™ SSDs with Intel® Memory Drive Technology emulate a single volatile memory pool". I'm confused: is the Intel Optane DC SSD able to act as DRAM, or only the Intel Optane DC Persistent Memory DIMM?
1 Answer
Yes, you can use a P4800x with Intel's IMDT (Intel Memory Drive Technology) software to give the illusion of more RAM by using the Optane DC SSD as swap space. This is what you want. IMDT sets up a hypervisor that gives the OS the illusion of DRAM + SSD as physical memory, instead of just letting the OS use the SSD as swap space normally.
Apparently this works well when you already have enough physical RAM for most of your working set, and IMDT has smart prefetching algorithms that try to page in ahead of when a page will be needed.
One advantage to running the OS under the IMDT hypervisor instead of just using the SSD as swap space is that it will get the OS to use some of that extra space for pagecache (aka disk caching), instead of needing special code to use (some of) an SSD as cache for a slower disk.
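For comparison, the non-IMDT route (just handing the SSD to the OS as swap) is plain administration. A minimal Linux sketch, assuming the Optane SSD shows up as `/dev/nvme0n1` (a hypothetical device name; check `lsblk` first):

```shell
# WARNING: destroys any data on the device. The device name is an assumption.
mkswap /dev/nvme0n1               # write a swap signature to the device
swapon --discard /dev/nvme0n1     # enable it; --discard lets the kernel
                                  # issue TRIM for pages freed from swap
swapon --show                     # confirm the new swap space is active
```

The OS then pages to the Optane SSD on its own; IMDT's claimed advantage over this is smarter prefetching and presenting the combined space to the guest OS as ordinary RAM.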
But no, it's not Optane DC Persistent Memory, that's something else.
See also a SuperUser answer for more about Optane vs. Optane DC PM. And Hadi Brais added some nice sections to it about IMDT for Optane SSDs.
The P4800x is connected over PCI Express (as you can see in the pictures at https://www.anandtech.com/show/11930/intel-optane-ssd-dc-p4800x-750gb-handson-review, for example). So it's not an NV-DIMM; you can't stick it in a DIMM socket and have the CPU access it over the memory bus. The form factor isn't DIMM.
As far as hardware, there are 3 things with the Optane brand name:
Consumer grade "Optane" SSDs. Just a fast PCIe NVMe using 3D XPoint memory instead of NAND flash.
Enterprise "Optane DC" SSDs. Just a fast PCIe NVMe using 3D XPoint memory. Not fundamentally different from the consumer stuff, just faster, with higher power consumption. The P4800x is this.
The "expand your RAM" functionality here is pure software, fairly similar (and possibly worse) than just creating a swap partition on it and letting the OS handle paging to it. Especially if you weren't using virtualization already.
Enterprise "Optane DC Persistent Memory" (PM for short). 3D XPoint memory that's truly mapped (by hardware) into physical address space for access with ordinary load/store instructions, without going through a driver for each read/write, e.g. Linux `mmap(MAP_SYNC)` and using `clflush` or `clwb` asm instructions in user-space to commit data to persistent storage. PM is still slower than DRAM, though, so if you just want volatile memory you might still use it as swap space like IMDT does. One key use-case for DC PM is giving databases the ability to commit to persistent storage without going through the OS. This allows out-of-order execution around I/O, as well as much lower overhead.
See articles like https://www.techspot.com/news/79483-intel-announces-optane-dc-persistent-memory-dimms.html which put Optane DC Persistent Memory above Optane DC in the classic pyramid storage hierarchy.
AFAIK, Optane DC PM devices only exist in a DIMM form-factor, not PCIe (and uses something like DDR4 signalling). This requires special support from the CPU because modern CPUs integrate the memory controller.
In theory you could have a PCIe device that exposed some persistent storage in a PCIe memory region. Those are part of physical address space and can be configured as write-back cacheable. (Or can they? Mapping MMIO region write-back does not work) So they could be memory-mapped into userland virtual address space. But I don't think any PCIe Optane DC Persistent Memory devices exist, probably because PCIe command latency is (much) higher than over the DDR4 bus. Bandwidth is also lower. So it makes sense to use it as fast swap space (copying in a whole page), not as write-back cacheable physical memory where you could have cache misses waiting a very long time.
(Margaret Bloom also comments re: block size of writes maybe being a problem.)
i.e. you don't want a "hot" part of your working set on memory that the CPU accesses over the PCIe bus. You probably don't even want that for Optane DC PM.
Optane / 3D XPoint is always persistent storage; it's up to software whether you take advantage of that or just use it as slower volatile RAM.
It's not literally DRAM; that term has a specific technical meaning (dynamic = data stored in tiny capacitors that need frequent refreshing). 3D XPoint isn't dynamic, and isn't even volatile. But you can use it as an equivalent because 3D XPoint memory has very good write endurance (it doesn't wear out the way NAND flash does). When people talk about using Optane as more DRAM, they're using the term to mean volatile RAM, filling the same role that DRAM traditionally fills.

- I think the OP is interested in the Optane DC PM, but when they looked it up, P4800x turned up in the search results, which is an SSD, not PM, resulting in the confusion. The terms "Optane DC PM", "Apache Pass", and "3D XPoint" are not synonyms. Optane DC PM is a brand for DIMM-compatible persistent memory modules. Apache Pass is a particular design, representing the first generation of Optane DC PM. This is similar to the Intel Core brand vs. microarchitecture. 3D XPoint is the memory technology used in the Optane DC PM and SSD products. – Hadi Brais Oct 02 '19 at 16:28
- Future generations of Optane DC PM and SSD were disclosed by Intel in [these](https://newsroom.intel.com/wp-content/uploads/sites/11/2019/09/Intel-2019_MemoryStorageDay_KristieMann_final.pdf) slides. – Hadi Brais Oct 02 '19 at 16:29
- @HadiBrais: So none of their PCIe devices are HW memory-mappable? It's just fast NVMe swap space + some software to fool the OS? Only the "PM" stuff truly puts the storage in physical address space? (And BTW, in my answer I did try to use 3D XPoint when talking specifically about the tech used for the memory cells, not just the products built out of them. But I was fooled by DC vs. DC Persistent Memory.) – Peter Cordes Oct 02 '19 at 16:45
- Right, the Optane SSDs are not accessed through memory-mapped I/O (but the files can certainly be memory-mapped, similar to HDD files). In contrast, Optane PM can be accessed like DRAM memory using load and store uops. I think the answer you linked to already says that. The answer there just incorrectly says that "Optane DC PM" and "Apache Pass" are synonyms. – Hadi Brais Oct 02 '19 at 16:58
- @HadiBrais: Ah, thanks for that correction for my SuperUser answer, will update. So there are no PCIe "Optane DC PM" devices that expose PCIe physical memory space, only NVMe? Anyway, `mmap` of block devices into virtual address space is irrelevant and a red herring in this context. As you know, mmaping a file (without `MAP_SYNC`) just copies to RAM and fakes it with software (page faults and page-table dirty bit). The interesting part is mapping storage into *physical* address space, allowing persistent commit without OS interaction. – Peter Cordes Oct 02 '19 at 18:06
- Got myself a little bit confused. I don't care about the speed, I just need the ability to act as RAM. Will the SSD be able to do that? If so, is the main difference between the SSD and DIMM the speed and form factor? The main thing that concerns me is the ability of the system (using SPDK) to see it as RAM, even if this is using software tricks and not being true DRAM. This article: https://arstechnica.com/gadgets/2018/05/intel-finally-announces-ddr4-memory-made-from-persistent-3d-xpoint/ presents the ability to use it as DRAM as something special. What am I missing? – Yoni Goikhman Oct 02 '19 at 19:10
- I've made an edit to your SuperUser answer to add more information. – Hadi Brais Oct 02 '19 at 20:53
- @YoniGoikhman Yes, Intel Optane DC Persistent Memory can do that. You're just confusing two very different things: Optane DC PM and Optane DC SSD are different products with different purposes. Also, your question is not clear. First you said "I don't care about the speed", then you're asking whether the main difference between SSD and DIMM is speed. Anyway, SSDs will not appear as RAM to the OS or any app unless you're using IMDT, discussed in the linked answer. – Hadi Brais Oct 02 '19 at 21:05
- @HadiBrais: Thanks for the edit on SU. Reviewers rejected it but I was still able to approve it manually after the fact. – Peter Cordes Oct 03 '19 at 02:23
- @HadiBrais Thank you, I think that IMDT is what I was looking for. I need it to act as a storage device and as DRAM as well. From what I understand, if combining the SSD and IMDT, the SSD will be able to act as RAM? I don't need the full speed of RAM, just its abilities and functionality when needed. – Yoni Goikhman Oct 03 '19 at 04:19
- @YoniGoikhman: yes, IMDT is apparently a hypervisor that uses the SSD as swap space to create a VM that looks like it has lots of RAM. IDK why that's better than letting the OS do it; maybe Intel figures they have better algorithms, and/or it's outside of any VMs vs. having OSes inside VMs do paging. – Peter Cordes Oct 03 '19 at 04:25
- I think because Intel doesn't want to make IMDT open-source and they intend to only support their SSDs to promote them. @YoniGoikhman If this answers your question, consider upvoting and accepting the answer. Otherwise, clarify your question by editing it. – Hadi Brais Oct 03 '19 at 12:34
- @HadiBrais: Good point about open-source and restricting it to their own SSDs. But what does IMDT do that OSes don't already do for themselves with swap partitions? Just custom prefetch / replacement algorithms? i.e. I'm saying IMDT doesn't need to exist at all, except for the marketing point of claiming it works like RAM. (Or IDK if hypervisors like Xen can do paging of VMs. Sharing the whole SSD dynamically in the hypervisor is better than giving each VM a swap partition) – Peter Cordes Oct 03 '19 at 12:38
- Agreed. I think IMDT can be completely implemented in the OS. This would even eliminate the requirement for and the overhead of virtualization. Also IMDT [appears](https://www.intel.com/content/dam/support/us/en/documents/memory-and-storage/intel-mdt-setup-guide.pdf) to require a non-trivial amount of setup, which could possibly be minimized had it been implemented in the OS. – Hadi Brais Oct 03 '19 at 12:42
- @HadiBrais: updated this answer significantly, removing my wrong guess about PCIe memory regions. On further thought, IMDT would have the advantage of getting the OS to use more pagecache, i.e. using the Optane SSD to cache the main storage. OSes won't normally do that with a swap partition / pagefile. Not sure how relevant that is for enterprise workloads, especially when it has to be accessed by paging into DRAM and then reading from there. – Peter Cordes Oct 03 '19 at 15:04
- As far as I understand it, there exist devices that expose themselves as an MMIO persistent range, but not for "free". This is called the BLK (block) mode in the NVDIMM and ACPI specifications. I'm not sure, but I suspect that single-word writes are still not possible with PM (especially if it's huge), so an intermediate buffer and some coordination are needed. The Optane DC PM has it all onboard and doesn't need any coordination from the SW (though the DDR-T interface used by the HW is not documented and may do some coordination?), but other devices have explicit apertures where to map data and commands. – Margaret Bloom Oct 03 '19 at 15:57
- @MargaretBloom: If your MMIO persistent storage is mapped write-back cacheable, you'd be committing 64-byte lines with `clflushopt` or `clwb`. I think this is the expected mode of use, although since you mention it I guess uncacheable, WC, or even WT could be options. I think 3D XPoint is internally byte-addressable, but IDK about the bus command protocol. What I was wondering about in my answer was whether there are any (Intel) devices that expose an MMIO persistent range *over PCIe*, rather than over DDR-T. Thanks for the extra details on how actual NV-DIMMs are specced, though. – Peter Cordes Oct 03 '19 at 16:42
- That may be possible (see the links [here](https://stackoverflow.com/questions/53311131/mapping-mmio-region-write-back-does-not-work)). But I think the I/O device itself has to be designed to handle fine-grained accesses (such as 64-byte granularity). This is usually supported for device commands sent from the driver to transfer blocks of data, but not for direct data accesses. Otherwise, the only alternative is emulation like IMDT. – Hadi Brais Oct 03 '19 at 17:35
- I was thinking more on the side of 1-4KiB for the PM "pages", if they share anything with the usual NV devices (i.e. flash ROMs). I haven't dug into PM yet, just making hypotheses. I don't really know how block (as per spec) devices work :) One day I'll take a better look at the spec and hopefully a datasheet. – Margaret Bloom Oct 03 '19 at 21:03
- @MargaretBloom: My understanding is that 3D XPoint truly can be read/written with fine granularity, totally unlike flash's large write / erase blocks. Along with high write endurance (which might make a wear-leveling / remapping layer unnecessary?) and high performance, that is a key part of what makes it usable as memory-mapped NV storage. https://pcper.com/2017/06/how-3d-xpoint-phase-change-memory-works/ says "single bit overwrites are possible without disturbing adjacent cells." It literally doesn't have to be erased before write. – Peter Cordes Oct 04 '19 at 03:32
- @PeterCordes That does seem to be the case. Very cool link, thank you! – Margaret Bloom Oct 04 '19 at 06:53