Given a .tar archive, MATLAB can extract the contained files to disk with the untar command. The extracted files can then be manipulated in the ordinary way.
Issue: When several files are stored in a tarball, they are stored contiguously on disk and can, in principle, be read serially. Once the files are extracted, this contiguity no longer holds and file access can become random, hence slow and inefficient.
This matters most when there are many (thousands of) small files.
My question: is there any way to access the archived files without extracting them first (in a sort of HDF5 fashion)?
In other words, would it be possible to cache the .tar so as to access the contained files from memory rather than from disk?
(In general, direct .tar manipulation is possible, e.g. in C# with tar-cs, or in Python.)
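
To make the Python case concrete, here is a minimal sketch of the kind of access I have in mind, using the standard tarfile module. It assumes an uncompressed .tar that fits in memory; the function names read_members_in_memory and build_demo_tar are hypothetical and only serve the illustration.

```python
import io
import tarfile


def read_members_in_memory(tar_bytes):
    """Return {member name: file contents} without writing anything to disk."""
    contents = {}
    # Wrap the raw archive bytes in a file-like object so tarfile reads from RAM.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as archive:
        for member in archive:  # members are visited in archive order
            if member.isfile():
                contents[member.name] = archive.extractfile(member).read()
    return contents


def build_demo_tar(files):
    """Hypothetical helper: build a small uncompressed tar archive in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```

With a real archive, tar_bytes would come from a single sequential read of the .tar file (for example open(path, "rb").read()), after which every member is served from memory. I am looking for an equivalent of this in MATLAB.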