h5py object dereferencing ultra slow?

Asked Aug 12 '18 at 17:58

Active Aug 12 '18 at 19:30

Viewed 106 times

I'm trying to read the full version of the SVHN dataset (http://ufldl.stanford.edu/housenumbers/). The file is HDF5, so I tried h5py (pandas takes a while to read it).

So I tried the method described in https://stackoverflow.com/a/41579641/1745291, but on my system (Arch Linux, up to date; h5py 2.8.0, hdf5 1.10.2-3, Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz) it is extremely slow: more than 30 s to read a single filename.
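For context, here is a minimal sketch of what reading one filename by dereferencing looks like with h5py. The layout is an assumption: a `digitStruct/name` dataset of object references, each pointing to a column of uint16 character codes, which is how MATLAB v7.3 files commonly store strings. `read_name` and `build_toy_file` are hypothetical helper names, and the toy file only stands in for the real digitStruct.mat so the snippet runs on its own; it says nothing about where the 30 s goes.

```python
import os
import tempfile

import h5py
import numpy as np


def read_name(f, index):
    """Return the filename stored at `index` by following its object reference."""
    ref = f["digitStruct/name"][index, 0]  # an h5py.Reference (assumed layout)
    chars = f[ref][()]                     # dereference, then read the whole array
    return "".join(chr(int(c)) for c in chars.flatten())


def build_toy_file(path, names):
    """Write a tiny stand-in file with the assumed layout (hypothetical helper)."""
    ref_dtype = h5py.special_dtype(ref=h5py.Reference)
    with h5py.File(path, "w") as f:
        targets = f.create_group("refs")
        name_ds = f.create_group("digitStruct").create_dataset(
            "name", (len(names), 1), dtype=ref_dtype
        )
        for i, name in enumerate(names):
            # One (len, 1) uint16 column of character codes per filename.
            codes = np.array([[ord(c)] for c in name], dtype=np.uint16)
            target = targets.create_dataset("n%d" % i, data=codes)
            name_ds[i, 0] = target.ref


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy_digitStruct.mat")
    build_toy_file(path, ["1.png", "2.png"])
    with h5py.File(path, "r") as f:
        print(read_name(f, 0))
```

Wrapping the `read_name` call with `time.perf_counter()` on the real file would show whether the time is spent in the reference lookup or in reading the target dataset.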

Is this a bug in this version? Is this the expected access time? (That would be hard to believe, since the format is reputed to be fast.) Am I doing something wrong?

Note: I also found this thread, which has no responses: https://groups.google.com/forum/#!topic/h5py/4eHydpsQ1HU

hl037_

That's an extremely large file, isn't it? – hpaulj Aug 12 '18 at 18:29
Not that large, only 228 MB (it's the digitStruct.mat from train.tar.gz) – hl037_ Aug 12 '18 at 18:34

0 Answers