Dear Stack Overflow community,
as promised, here are the results of the experiments I ran based on your many suggestions.
A special thanks to @user894763, who put me on the "right path".
tl;dr: use pnm files inside an uncompressed tar (yes, I said pnm!).
I ran the experiments on two high-end machines, one with SSDs and the other using a networked file system. Both have high-end CPUs, but they sit at opposite ends of the spectrum for disk access. Surprisingly, the conclusions are the same for both machines, and the ratios among file formats are almost identical in both experiments, so I report only one set of results (for the latter machine).
From these experiments I have learned two important things:
- When reading files from disk, the operating system disk cache is king (i.e. the operating system tries as much as possible to keep file operations in RAM instead of on the physical device, and it does a really good job at this).
- Contrary to my initial guess, reading images from disk is a CPU-bound operation, not an I/O-bound one.
Experiment protocol
I am reading a set of ~1200 images in a fixed sequence. No computation is done on the images; I am simply measuring the time to load the pixels into memory. The tar file sizes are ~600 MB in pnm format, ~300 MB in png format, and ~200 MB in webp format.
"Fresh read" means first read done on the machine.
"Cached read" means the second read done on the same machine (and any subsequent one).
All rates are in images read per second (Hz) and are accurate to roughly ±10 Hz.
webp fresh read: 30 Hz
webp cached read: 80 Hz
webp + tar fresh read: 100 Hz
webp + tar cached read: 100 Hz
png fresh read: 50 Hz
png cached read: 165 Hz
png + tar fresh read: 200 Hz
png + tar cached read: 200 Hz
pnm fresh read: 50 Hz
pnm cached read: 600 Hz
pnm + tar fresh read: 200 Hz
pnm + tar cached read: 2300 Hz
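For readers who want to reproduce this kind of measurement, here is a minimal sketch of how a rate in Hz (images per second) could be computed. It is not the code used for the numbers above: `measure_rate_hz` and `read_raw_bytes` are hypothetical names, and `read_raw_bytes` only reads the file bytes without decoding them, so a real benchmark would pass in a loader that fully decodes the pixels.

```python
import time


def measure_rate_hz(load_image, paths):
    """Return how many images per second `load_image` handles over `paths`.

    `load_image` is any callable taking a path; it should decode the image
    fully so that decompression time is included in the measurement.
    """
    start = time.perf_counter()
    count = 0
    for path in paths:  # fixed sequence, no computation on the pixels
        load_image(path)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def read_raw_bytes(path):
    """Stand-in loader (an assumption): reads the file, does not decode it."""
    with open(path, "rb") as f:
        return f.read()
```

Calling `measure_rate_hz` twice in a row on the same file list gives something like the "fresh" and "cached" figures, provided the files were not already in the disk cache before the first pass.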
Notes
I was told that there may be a way to change the webp compression parameters to make decompression faster. I suspect that it would still not match the pnm performance.
Please note that I used custom code to read the images in the tar file; the file is read from disk "image by image".
I do not know why reading the webp images "fresh" was slower than reading the png ones; I can only speculate that the networked disk system had some "internal" cache that somewhat changed the behaviour. However, this does not affect the lessons.
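To illustrate what reading a tar "image by image" can look like, here is a minimal sketch using Python's standard `tarfile` module. This is not the custom code mentioned above. `iter_tar_images` and `parse_binary_pnm` are hypothetical helpers, and the parser assumes binary PGM/PPM files (magic `P5` or `P6`) with a maximum value below 256 and no comment lines in the header.

```python
import tarfile


def parse_binary_pnm(data):
    """Parse a binary PGM (P5) or PPM (P6) image held in `data` (bytes).

    Assumptions: maxval < 256 and no '#' comments in the header.
    Returns (width, height, channels, pixel_bytes).
    """
    tokens = []
    pos = 0
    while len(tokens) < 4:  # magic, width, height, maxval
        while data[pos:pos + 1].isspace():
            pos += 1
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates the header from pixels
    magic, width, height, maxval = tokens
    if magic not in (b"P5", b"P6") or int(maxval) > 255:
        raise ValueError("unsupported PNM variant")
    channels = 1 if magic == b"P5" else 3
    size = int(width) * int(height) * channels
    pixels = data[pos:pos + size]
    if len(pixels) != size:
        raise ValueError("truncated pixel data")
    return int(width), int(height), channels, pixels


def iter_tar_images(tar_path):
    """Yield (name, image) for each regular file, in archive order."""
    with tarfile.open(tar_path, "r:") as tar:  # "r:" = uncompressed tar only
        for member in tar:  # sequential scan, one member at a time
            if not member.isfile():
                continue
            data = tar.extractfile(member).read()
            yield member.name, parse_binary_pnm(data)
```

Note that the pnm pixels are used as stored, with no decompression step, which is consistent with the CPU-bound behaviour described in this post.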
Lessons
If you read a file (or a set of files) multiple times, the operating system disk cache will make all subsequent reads essentially "as fast as reading from RAM".
Even when reading from disk, the time to decompress images is non-negligible.
Putting all the files into a single uncompressed (tar) file makes things significantly faster, because the operating system will assume that the whole file will be read and will pre-load future images even before we access them. This does not seem to happen when simply reading files inside a folder.
With proper care, a speed-up of roughly 4x to 10x can be obtained when reading a sequence of images from disk (especially if they are read repeatedly).
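As a sketch of the packing step behind the tar lesson, the following uses Python's standard `tarfile` module to build an uncompressed archive. `pack_uncompressed_tar` is a hypothetical helper, and the `.pnm` suffix and sorted file order are assumptions chosen for illustration; the same result can be had with the `tar` command-line tool without a compression flag.

```python
import os
import tarfile


def pack_uncompressed_tar(image_dir, tar_path, suffix=".pnm"):
    """Pack every file in `image_dir` ending with `suffix` into a plain tar.

    Files are added in sorted order, so a later sequential read of the
    archive visits the images in a fixed, predictable sequence.
    Returns the list of member names that were added.
    """
    names = sorted(n for n in os.listdir(image_dir) if n.endswith(suffix))
    with tarfile.open(tar_path, "w") as tar:  # plain "w": no compression
        for name in names:
            tar.add(os.path.join(image_dir, name), arcname=name)
    return names
```

Keeping the archive uncompressed matters here: a mode such as `"w:gz"` would reintroduce a decompression cost on every read.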