
I made a tmpfs filesystem in my home directory on Ubuntu using this command:

$ mount -t tmpfs -o size=1G,nr_inodes=10k,mode=0777 tmpfs space
$ df -h space .
Filesystem                   Size  Used Avail Use% Mounted on
tmpfs                        1.0G  100M  925M  10% /home/user/space
/dev/mapper/ubuntu--vg-root  914G  373G  495G  43% /

Then I wrote this Python program:

#!/usr/bin/env python3

import time
import pickle


def f(fn):
    # Load a pickle from fn and print the elapsed wall-clock time.
    start = time.time()
    with open(fn, "rb") as fh:
        data = pickle.load(fh)
    end = time.time()
    print(str(end - start) + "s")
    return data


obj = list(map(str, range(10 * 1024 * 1024)))  # pickles to approx. 100 MB


def l(fn):
    # Dump the global obj to fn as a pickle.
    with open(fn, "wb") as fh:
        pickle.dump(obj, fh)


print("Dump obj.pkl")
l("obj.pkl")
print("Dump space/obj.pkl")
l("space/obj.pkl")

_ = f("obj.pkl")
_ = f("space/obj.pkl")

The result:

Dump obj.pkl
Dump space/obj.pkl
0.6715312004089355s
0.6940639019012451s

I am confused by this result. Isn't tmpfs a RAM-backed file system, and isn't RAM supposed to be notably faster than any hard disk, including SSDs?

Furthermore, I noticed that this program uses over 15 GB of RAM when I increase the target file size to approx. 1 GB.

How can this be explained?

The background of this experiment is that I am trying to find alternative caching locations to the hard disk and Redis that are faster and available to multiple worker processes.

Green绿色
  • Wouldn't you use `cpickle` if in a hurry? – Mark Setchell Sep 25 '20 at 16:42
  • More of a discussion point than an answer; sorry about the formatting this inflicts. I created a tmpsfs using the same means as you (with the same name under my home, space). `$ time dd if=/dev/zero of=space/test.img bs=1048576 count=100 100+0 records in 100+0 records out 104857600 bytes (105 MB, 100 MiB) copied, 0.0231555 s, 4.5 GB/s real 0m0.030s user 0m0.000s sys 0m0.030s` – tink Sep 25 '20 at 22:19
  • And to SSD: `$ time dd if=/dev/zero of=test.img bs=1048576 count=100 100+0 records in 100+0 records out 104857600 bytes (105 MB, 100 MiB) copied, 0.165582 s, 633 MB/s real 0m0.178s user 0m0.000s sys 0m0.060s` – tink Sep 25 '20 at 22:19
  • Could be python responsible for the time, not the FS/medium of choice. `0m0.030s` vs `0m0.178s` ... seems like a clear winner for tmpfs ... – tink Sep 25 '20 at 22:20
  • 1
    @tink Yes, I can replicate your observations, so probably a Python issue. I would speculate, that maybe it's the reconstruction of the Python data structure that takes most of the time into account, so that the short read times do not alter the total time notably. – Green绿色 Sep 26 '20 at 03:50
  • @MarkSetchell Using `_pickle` instead of `pickle` does not make any difference to the final time measurements. A library called `cpickle` apparently does not exist in Python3. – Green绿色 Sep 26 '20 at 03:56
  • Glad that's settled, then ;) – tink Sep 26 '20 at 04:09
  • Would still be curious of the actual reasons though. – Green绿色 Sep 26 '20 at 04:17
  • @Green - new question? Timing of pickling in Python? – tink Sep 26 '20 at 18:14
  • ok, thanks for your contribution! – Green绿色 Sep 27 '20 at 01:03

1 Answer


Answer following on from the comments:

The elapsed time seems to be a Python thing rather than an effect of the storage medium.
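One way to check this (a sketch, not from the original post) is to separate the raw I/O from the deserialization: read the whole file into memory first, then time `pickle.loads` on the bytes. On both tmpfs and SSD the `loads` step, which rebuilds millions of Python string objects, should dominate, which would explain why the two media produce nearly identical totals.

```python
import pickle
import time


def timed_load(fn):
    """Time the raw read and the deserialization of a pickle file separately."""
    t0 = time.perf_counter()
    with open(fn, "rb") as fh:
        raw = fh.read()           # pure I/O: this is where tmpfs vs. SSD matters
    t1 = time.perf_counter()
    data = pickle.loads(raw)      # pure deserialization: no I/O involved
    t2 = time.perf_counter()
    print(f"read: {t1 - t0:.4f}s  loads: {t2 - t1:.4f}s")
    return data
```

Running `timed_load("obj.pkl")` and `timed_load("space/obj.pkl")` should show similar `loads` times and very different `read` times.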

In a similar set-up (SSD vs. tmpfs), using OS commands on Linux, the speed difference when writing a 100 MB file is notable:

To tmpfs:

$ time dd if=/dev/zero of=space/test.img bs=1048576 count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0231555 s, 4.5 GB/s

real    0m0.030s
user    0m0.000s
sys 0m0.030s

To SSD:

$ time dd if=/dev/zero of=test.img bs=1048576 count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.165582 s, 633 MB/s

real    0m0.178s
user    0m0.000s
sys 0m0.060s
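On the memory question: a rough back-of-the-envelope sketch (the sizes below are CPython 64-bit assumptions and vary slightly by version) shows why a list of ~10 million short strings occupies far more RAM than its ~100 MB pickled form. Each small `str` carries per-object overhead on top of its payload, and each list slot holds an 8-byte pointer; pickling and unpickling additionally hold temporary buffers on top of that.

```python
import sys

s = "1048575"                  # a typical element of the list
per_str = sys.getsizeof(s)     # object header + character payload
per_slot = 8                   # each list slot is a pointer on 64-bit builds
n = 10 * 1024 * 1024           # number of elements in the original example
est = n * (per_str + per_slot)
print(f"~{est / 2**20:.0f} MiB for the list alone")
```

Scaled up to the ~1 GB pickle, this per-object overhead (plus transient pickler/unpickler buffers) plausibly accounts for the multi-gigabyte RAM usage the question reports.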
tink