11

I am wondering how the emerging SSD technology affects (mostly systems) programming. Plenty of questions arise, but here are some of the most obvious ones:

  • Can the speed of disk access now be considered anywhere near memory speed?
  • If not, is this just a temporary state, or are there fundamental reasons why SSDs will never be as fast as RAM?
  • Are B-Trees (and their cousins) still relevant?
  • If so, are there any adjustments or modifications of B-Trees (B+-Trees, R-Trees, etc.) tailored to SSDs? If not, are there other data structures crafted specifically for SSDs?
lollo

6 Answers

8

It is true that SSDs eliminate the seek-time issue for reading, but writing to them efficiently is quite tricky. We have been doing some research into these issues while looking for the best way to use SSDs for the Acunu storage core.

You might find these interesting:

Irit Katriel
  • I'm afraid it was a mistake to trust Acunu to keep the blog up. I don't have a copy. The gist of the second was an interesting experiment: – Irit Katriel Dec 31 '14 at 16:05
  • We write to an SSD until we fill it up several times over, but we do this in different ways. First, we write sequentially - this is fast. However, if we write to the addresses in random order, it is fast until we have written the SSD's full capacity, but degrades sharply after that. Next, we try writing every other block (so we write a block, skip a block, write a block, skip a block). This is fast and stays fast. But if we toss a coin after each block to decide whether to skip the next block or not, it degrades like the random write sequence. – Irit Katriel Dec 31 '14 at 16:14
  • In the final experiment we compute a random permutation of the blocks of the disk and then write the disk in the order of this permutation, several times. This is fast, like a sequential write order. We conclude that the issue for SSDs is predictability rather than seek time (as it is with hard disks and the like), and we conjecture that changes to the logical-to-physical address translation are what cause the slowdown. – Irit Katriel Dec 31 '14 at 16:18
  • I'm a year late, but here you go :) [Log file systems and SSDs -- made for each other?](https://web.archive.org/web/20120428011547/http://www.acunu.com/blogs/irit-katriel/theoretical-model-writes-ssds/) [Why theory fails for SSDs](https://web.archive.org/web/20120729224435/http://www.acunu.com/2/post/2011/08/why-theory-fails-for-ssds.html) – Casey Sep 25 '15 at 01:43
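For readers who want to try something similar, here is a rough sketch of the experiment described in the comments above, in Python. This is my own illustration, not Acunu's benchmark; the scratch path, block size and working-set size are assumptions, and a rigorous test would write to a raw device with O_DIRECT and aligned buffers so the OS page cache does not mask the SSD's behaviour.

```python
# Rough sketch of the write-pattern experiment described above (illustrative only).
# Assumed scratch path, block size and block count; a real test would target a raw
# device with O_DIRECT so the page cache does not hide the SSD's behaviour.
import os, random, time

PATH    = "/tmp/ssd_scratch.bin"   # assumption: scratch file on the SSD under test
BLOCK   = 4096                     # assumption: 4 KiB logical blocks
BLOCKS  = 64 * 1024                # assumption: 256 MiB working set
payload = os.urandom(BLOCK)

def run(label, order_fn, passes=3):
    """Write every block `passes` times, in the order produced by order_fn()."""
    fd = os.open(PATH, os.O_CREAT | os.O_WRONLY | os.O_DSYNC)
    start = time.time()
    for _ in range(passes):
        for i in order_fn():
            os.pwrite(fd, payload, i * BLOCK)
    os.close(fd)
    print(f"{label}: {time.time() - start:.1f} s")

fixed_perm = random.sample(range(BLOCKS), BLOCKS)   # one permutation, reused each pass

run("sequential",              lambda: range(BLOCKS))
run("fresh random each pass",  lambda: random.sample(range(BLOCKS), BLOCKS))
run("same permutation reused", lambda: fixed_perm)
```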
3
  • Current flash-based SSDs are not nearly as fast as main-memory DRAM. Will non-volatile memory technology eventually perform as well as DRAM? Someday, perhaps; there are a lot of promising technologies under development.
  • One bottleneck in SSD performance is the SATA interface. As the technology improves, SSDs will be connected directly to the memory bus or the PCIe bus.
  • B-trees are still relevant as long as memory access is performed in blocks. Even DRAM is accessed in blocks, and frequently used blocks are cached by the CPU. Although difficult to implement, a B-tree designed to operate in DRAM can outperform other kinds of volatile search trees; the performance benefit is unlikely to be apparent until the tree has millions of entries, however.
  • B-trees implemented for SSDs benefit from improvements in block allocation. Current-generation flash SSDs prefer sequentially ordered writes, so as the B-tree grows (or changes), new blocks should be allocated in sequential order to get the best write performance. Log-based storage formats should do well, but I've not seen any implementations that scale. As the performance gap between sequentially and randomly ordered writes narrows, allocation order will become less important.
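To make the block-orientation point above concrete, here is a small back-of-the-envelope sketch (mine, not the answer author's). The page size and key/pointer sizes are assumptions; the point is that once a node matches the device's page size, lookups touch very few blocks.

```python
# Back-of-the-envelope sketch: size a B-tree node to one flash page and estimate
# how many page reads a point lookup costs. Page, key and pointer sizes are assumed.
import math

PAGE = 8192          # assumption: 8 KiB flash page
KEY, PTR = 16, 8     # assumption: 16-byte keys, 8-byte child pointers

fanout  = (PAGE - PTR) // (KEY + PTR)           # keys that fit in one page-sized node
entries = 100_000_000
height  = math.ceil(math.log(entries, fanout))  # roughly, page reads per lookup

print(f"fanout ~ {fanout}, a lookup reads ~ {height} pages")
# With a fanout in the hundreds, even 10^8 entries need only 3-4 page reads,
# which is why block-oriented B-trees still pay off on SSDs (and even in DRAM,
# where the "block" is effectively a cache line).
```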
boneill
  • http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&DEPA=0&Order=BESTMATCH&Description=revodrive&x=0&y=0 – dariol Jul 25 '11 at 21:44
2
  1. RAM doesn't have to retain its contents across a reset/reboot, while an SSD does. I highly doubt SSDs will ever be as fast as RAM.
  2. B-Trees are still very much relevant, since you still try to minimize disk reads: with a fanout of a few hundred keys per block, a lookup in a tree with a billion entries touches only a handful of blocks.
Karoly Horvath
1

While the seek times of SSDs are better than those of HDDs by an order of magnitude or two, compared to RAM these times are still significant. This means that issues related to seek times are not as bad as before, but they are still there, and throughput is still much lower than that of RAM. Apart from the storage technology, the connection matters: RAM is physically very close to the CPU and other components on the motherboard and uses a dedicated bus, an advantage mass-storage devices don't have. There exist battery-backed packages of RAM modules which can act as an ultra-fast HDD substitute, but if they attach via SATA, SCSI or another typical disk interface, they are still slower than system RAM.

This means that B-trees are still relevant, and for high performance you still need to take care of what is in RAM and what is in permanent storage. Due to the overall architecture and the physical limitations (non-volatile writes will probably always tend to be slower than volatile ones), I think this gap may become smaller, but I doubt it will be completely gone in any foreseeable future. Even if you look at "RAM", you really don't have a single speed there, but several levels of faster and faster (but smaller and more expensive) caches. So at least some differences are here to stay.
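As a small illustration of the "decide what lives in RAM and what lives in permanent storage" point, here is a sketch of a read-through LRU block cache. It is my own example; the class and parameter names are made up, and real systems would add write handling, concurrency and direct I/O.

```python
# Illustrative sketch of keeping hot blocks in RAM in front of slower permanent
# storage. Class and parameter names are invented for this example.
import collections, os

class BlockCache:
    def __init__(self, path, block=4096, capacity=1024):
        self.fd = os.open(path, os.O_RDONLY)
        self.block = block
        self.capacity = capacity                   # number of blocks kept in RAM
        self.cache = collections.OrderedDict()     # LRU map: block index -> bytes

    def read(self, index):
        if index in self.cache:                    # RAM hit: nanosecond scale
            self.cache.move_to_end(index)
            return self.cache[index]
        data = os.pread(self.fd, self.block, index * self.block)  # SSD: ~tens of microseconds
        self.cache[index] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)         # evict the least recently used block
        return data
```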

Michał Kosmulski
1

One factor comes readily to mind...

There has been a growing trend towards treating hard drives as if they are tape drives, due to the high relative cost of making heads move between widely separated tracks. This has led to efforts to optimise data access patterns so that the head can move smoothly across the surface rather than seeking randomly.

SSDs practically eliminate the seek penalty, so we can go back to not worrying so much about the layout of data on disk. (More accurately, we have a different set of worries, due to wear-levelling concerns).
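For illustration (my own sketch, not the answer author's code): the HDD-era habit was to sort pending requests by offset so the head sweeps across the platter in one direction, and on an SSD that reordering buys almost nothing, because there is no head to move.

```python
# Illustrative sketch: "elevator"-style ordering of pending reads, which helps a
# rotating disk but matters little on an SSD. Parameter names are made up.
import os

def read_blocks(fd, offsets, block=4096, rotational=True):
    # On an HDD, issuing reads in ascending offset order avoids long seeks;
    # on an SSD the original order is essentially just as good.
    order = sorted(offsets) if rotational else offsets
    return {off: os.pread(fd, block, off) for off in order}
```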

Marcelo Cantos
  • "SSDs practically eliminate the seek penalty,:" - no. They just minimize the seek time in order of 10..200 times. Seek time is still here (finding a translation from logical into physical). Write time is worse, because single rewrite block is bigger than on HDD (8-16kb vs 0.5-4kb) – osgx Aug 22 '11 at 16:46
0

I tested build time on an SSD and a RamDisk; the SSD was a little faster. The same result was achieved by my coworker with an entirely different setup - build time on an HDD was 9 minutes, on a RamDisk 3 min 30 sec, and on an SSD 3 min 0 sec.

Meo
  • This doesn't make any sense. You must have run out of RAM, or have crappy RAM? A RamDisk can get over 9000 MB/s read speeds (overclocked RAM); SSDs are at about 600 MB/s right now. (Write speeds are on the same scale, I forget the numbers.) – Farzher Jan 16 '13 at 22:49
  • Of course synthetic tests show that the RamDisk is much faster. That test was done on SATA 1 or SATA 2 and I have no explanation. The amount of available RAM was the same; only in one case everything was done on the RamDisk, and in the second case the RamDisk was not used but still occupied memory. – Meo Feb 11 '13 at 20:39
  • After many tries I figured it out. When the RamDisk is formatted as FAT32, then even though benchmarks show high values, real-world use is actually slower than a fully encrypted, NTFS-formatted SSD on SATA 2, which tops out at about 100 MB/s. But an NTFS-formatted RamDisk is faster in real life than the SSD. – Meo Jun 07 '13 at 23:50